NEW SEQUENCES OF HEPATITIS C VIRUS GENOTYPES AND THEIR USE AS
THERAPEUTIC AND DIAGNOSTIC AGENTS 



The invention relates to new sequences of hepatitis C virus (HCV) genotypes and their use
as therapeutic and diagnostic agents.

The present invention relates to new nucleotide and amino acid sequences corresponding
to the coding region of a new type 2 subtype 2d, type-specific sequences corresponding to
HCV type 3a, to new sequences corresponding to the coding region of a new subtype 3c, and
to new sequences corresponding to the coding region of HCV type 4 and type 5 subtype 5a;
a process for preparing them, and their use for diagnosis, prophylaxis and therapy.

The technical problem underlying the present invention is to provide new type-specific
sequences of the Core, the E1, the E2, the NS3, the NS4 and the NS5 regions of HCV type
4 and type 5, as well as of new variants of HCV types 2 and 3. These new HCV sequences
are useful to diagnose the presence of type 2 and/or type 3 and/or type 4 and/or type 5 HCV
genotypes in a biological sample. Moreover, the availability of these new type-specific
sequences can increase the overall sensitivity of HCV detection and should also prove to be
useful for therapeutic purposes.

Hepatitis C viruses (HCV) have been found to be the major cause of non-A, non-B
hepatitis. The sequences of cDNA clones covering the complete genome of several prototype
isolates have been determined (Kato et al., 1990; Choo et al., 1991; Okamoto et al., 1991;
Okamoto et al., 1992). Comparison of these isolates shows that the variability in nucleotide
sequences can be used to distinguish at least 2 different genotypes, type 1 (HCV-1 and HCV-J)
and type 2 (HC-J6 and HC-J8), with an average homology of about 63%. Within each
type, at least two subtypes exist (e.g. represented by HCV-1 and HCV-J), having an average
homology of about 79%. HCV genomes belonging to the same subtype show average
homologies of more than 90% (Okamoto et al., 1992). However, the partial nucleotide
sequence of the NS5 region of the HCV-T isolates showed at most 67% homology with the
previously published sequences, indicating the existence of yet another HCV type (Mori et
al., 1992). Parts of the 5' untranslated region (UR), core, NS3, and NS5 regions of this type
3 have been published, further establishing the similar evolutionary distances between the 3
major genotypes and their subtypes (Chan et al., 1992).

The identification of type 3 genotypes in clinical samples can be achieved by means of
PCR with type-specific primers for the NS5 region. However, the degree to which this will



SUBSTITUTE SHEET (RULE 26) 




WO 94/25601 PCT/EP94/01323


be successful is largely dependent on sequence variability and on the virus titer present in the
serum. Therefore, routine PCR in the open reading frame, especially for type 3 and the new
type 4 and 5 described in the present invention and/or group V (Cha et al., 1992) genotypes
can be predicted to be unsuccessful. A new typing system (LiPA), based on variation in the
highly conserved 5' UR, proved to be more useful because the 5 major HCV genotypes and
their subtypes can be determined (Stuyver et al., 1993). The selection of high-titer isolates
makes it possible to obtain PCR fragments for cloning with only 2 primers, while nested PCR
requires that 4 primers match the unknown sequences of the new type 3, 4 and 5 genotypes.

New sequences of the 5' untranslated region (5'UR) have been listed by Bukh et al.
(1992). For some of these, the E1 region has recently been described (Bukh et al., 1993).
Isolates with similar sequences in the 5'UR to a group of isolates including DK12 and HK10
described by Bukh et al. (1992) and E-b1 to E-b8 described and classified as type 3 by Chan
et al. (1991), have been reported and described in the 5'UR, the carboxyterminal part of E1,
and in the NS5 region as group IV by Cha et al. (1992; WO 92/19743), and have also been
described in the 5'UR for isolate BR56 and classified as type 3 by the inventors of this
application (Stuyver et al., 1993).

The aim of the present invention is to provide new HCV nucleotide and amino acid
sequences enabling the detection of HCV infection.

Another aim of the present invention is to provide new nucleotide and amino acid HCV
sequences enabling the classification of infected biological fluids into different serological
groups unambiguously linked to types and subtypes at the genome level.

Another aim of the present invention is to provide new nucleotide and amino acid HCV
sequences improving the overall HCV detection rate.

Another aim of the present invention is to provide new HCV sequences, useful for the
design of HCV vaccine compositions.

Another aim of the present invention is to provide a pharmaceutical composition consisting
of antibodies raised against the polypeptides encoded by these new HCV sequences, for
therapy or diagnosis.

The present invention relates more particularly to a composition comprising or consisting 
of at least one polynucleic acid containing at least 5, and preferably 8 or more contiguous 
nucleotides selected from at least one of the following HCV sequences: 
- an HCV type 3 genomic sequence, more particularly in any of the following 
regions: 




the region spanning positions 417 to 957 of the Core/El region of HCV 
subtype 3a, 

the region spanning positions 4664 to 4730 of the NS3 region of HCV type
3,

the region spanning positions 4892 to 5292 of the NS3/4 region of HCV 
type 3, 

the region spanning positions 8023 to 8235 of the NS5 region of the BR36 
subgroup of HCV subtype 3a, 
an HCV subtype 3c genomic sequence, 
more particularly the coding regions of the above-specified regions; 

- an HCV subtype 2d genomic sequence, more particularly the coding region of HCV
subtype 2d;

- an HCV type 4 genomic sequence, more particularly the coding region, more particularly
the coding region of subtypes 4a, 4e, 4f, 4g, 4h, 4i, and 4j;

- an HCV type 5 genomic sequence, more particularly the coding region of HCV type 5,
more particularly the regions encoding Core, E1, E2, NS3, and NS4;

with said nucleotide numbering being with respect to the numbering of HCV nucleic acids
as shown in Table 1, and with said polynucleic acids containing at least one nucleotide
difference with known HCV (type 1, type 2, and type 3) polynucleic acid sequences in the
above-indicated regions, or the complement thereof.

It is to be noted that the nucleotide difference in the polynucleic acids of the invention may
or may not involve an amino acid difference in the corresponding amino acid sequences coded
by said polynucleic acids.

According to a preferred embodiment, the present invention relates to a composition
comprising or containing at least one polynucleic acid encoding an HCV polyprotein, with
said polynucleic acid containing at least 5, preferably at least 8 nucleotides corresponding to
at least part of an HCV nucleotide sequence encoding an HCV polyprotein, and with said
HCV polyprotein containing in its sequence at least one of the following amino acid residues:
L7, Q43, M44, S60, R67, Q70, T71, A79, A87, N106, K115, A127, A190, S130, V134,
G142, I144, E152, A157, V158, P165, S177 or Y177, I178, V180 or E180 or F182, R184,
I186, H187, T189, A190, S191 or 0191, Q192 or L192 or I192 or V192 or E192, N193 or
H193 or P193, W194 or Y194, H195, A197 or I197 or V197 or T197, V202, I203 or L203,
Q208, A210, V212, F214, T216, R217 or D217 or E217 or V217, H218 or N218, H219 or
V219 or L219, L227 or I227, M231 or E231 or Q231, T232 or D232 or A232 or K232,
Q235 or I235, A237 or T237, I242, I246, S247, S248, V249, S250 or Y250, I251 or V251
or M251 or F251, D252, T254 or V254, L255 or V255, E256 or A256, M258 or F258 or
V258, A260 or Q260 or S260, A261, T264 or Y264, M265, I266 or A266, A267, C268 or
T268, F271 or NC71 or V271, I277, M280 or H280, I284 or A284 or L284, V274, V291,
N292 or S292, R293 or I293 or Y293, Q294 or R294, L297 or 097 or Q297, A299 or K299
or Q299, N303 or T303, T308 or L308, T310 or F310 or A310 or D310 or V310, L313,
G317 or Q317, U33, S351, A358, A359, A363, S364, A366, T369, L373, F376, Q386,
I387, S392, I399, F402, I403, R405, D454, A461, A463, T464, K484, Q500, E501, S521,
K522, H524, N528, S531, S532, V534, F536, F537, M539, I546, C1282, A1283, K1310,
V1312, Q1321, P1368, V1372, V1373, K1405, Q1406, S1409, A1424, A1429, C1435,
S1436, S1456, H1496, A1504, D1510, D1529, I1543, N1567, D1556, N1567, M1572,
Q1579, L1581, S1583, F1585, V1595, E1606 or T1606, M1611, V1612 or L1612, P163G,
C1636, P1651, T1656 or I1656, L1663, V1667, V1677, A1681, H1685, E1687, G1689,
V1695, A1700, Q1704, Y1705, A1713, A1714 or S1714, M1718, D1719, A1721 or T1721,
R1722, A1723 or V1723, H1726 or G1726, E1730, V1732, F1735, I1736, S1737, R1738,
T1739, G1740, Q1741, K1742, Q1743, A1744, T1745, L1746, E1747 or K1747, I1749,
A1750, T1751 or A1751, V1753, N1755, K1756, A1757, P1758, A1759, H1762, T1763,
Y1764, P2645, A2647, K2650, K2653 or L2653, S2664, N2673, F2680, K2681, L2686,
H2692, Q2695 or L2695 or I2695, V2712, F2715, V2719 or Q2719, T2722, T2724, S2725,
R2726, G2729, Y2735, H2739, I2748, G2746 or I2746, I2748, P2752 or K2752, P2754 or
T2754, T2757 or P2757, with said notation being composed of a letter representing the amino
acid residue by its one-letter code, and a number representing the amino acid numbering
according to Kato et al., 1990.
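The residue notation just defined (a one-letter amino acid code followed by a polyprotein position) can be read mechanically. The following is a minimal Python sketch, offered only as an illustration; the names Residue, parse_residue and has_residue are hypothetical and do not appear in the specification, and the check assumes the polyprotein string is already numbered from the initiating methionine as position 1.

```python
import re
from typing import NamedTuple


class Residue(NamedTuple):
    amino_acid: str  # one-letter amino acid code, e.g. "S"
    position: int    # 1-based polyprotein position (numbering of Kato et al., 1990)


# A token is one of the 20 standard one-letter codes followed by digits.
_TOKEN = re.compile(r"^([ACDEFGHIKLMNPQRSTVWY])(\d+)$")


def parse_residue(token: str) -> Residue:
    """Split a token such as 'S177' into its residue letter and position."""
    match = _TOKEN.match(token.strip())
    if match is None:
        raise ValueError(f"not a valid residue token: {token!r}")
    return Residue(match.group(1), int(match.group(2)))


def has_residue(polyprotein: str, token: str) -> bool:
    """Return True if the polyprotein carries the given residue at that position."""
    residue = parse_residue(token)
    if not 1 <= residue.position <= len(polyprotein):
        return False
    return polyprotein[residue.position - 1] == residue.amino_acid
```

For instance, has_residue(sequence, "L7") asks whether position 7 of the supplied polyprotein is a leucine; alternatives such as "S177 or Y177" would simply be tested one token at a time.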

Each of the above-mentioned residues can be found in any of Figures 2, 5, 7, 11 or 12
showing the new amino acid sequences of the present invention aligned with known sequences
of other types or subtypes of HCV for the Core, E1, E2, NS3, NS4, and NS5 regions.

More particularly, a polynucleic acid contained in the composition according to the present
invention contains at least 5, preferably 8, or more contiguous nucleotides corresponding to
a sequence of contiguous nucleotides selected from at least one of HCV sequences encoding
the following new HCV amino acid sequences:

- new sequences spanning amino acid positions 1 to 319 of the Core/E1 region of HCV
subtype 2d, type 3 (more particularly new sequences for subtypes 3a and 3c), new type 4
subtypes (more particularly new sequences for subtypes 4a, 4e, 4f, 4g, 4h, 4i and 4j) and
type 5a, as shown in Figure 5;

- new sequences spanning amino acid positions 328 to 546 of the E1/E2 region of HCV
subtype 5a as shown in Figure 12;

- new sequences spanning amino acid positions 1556 to 1764 of the NS3/NS4 region of
HCV type 3 (more particularly for new subtype 3a sequences), and subtype 5a, as shown
in Figure 7 or 11;

- new sequences spanning amino acid positions 2645 to 2757 of the NS5B region of HCV
subtype 2d, type 3 (more particularly for new subtypes 3a and 3c), new type 4 subtypes
(more particularly subtypes 4a, 4e, 4f, 4g, 4h, 4i and 4j) and subtype 5a, as shown in
Figure 2.

Using the LiPA system mentioned above, Brazilian blood donors with high-titer type 3
hepatitis C virus, Gabonese patients with high-titer type 4 hepatitis C virus, and a Belgian
patient with high-titer HCV type 5 infection were selected. Nucleotide sequences in the core,
E1, NS5 and NS4 regions which have not been reported before were analyzed in the
frame of the invention. Coding sequences (with the exception of the core region) of any type
4 isolate are reported for the first time in the present invention. The NS5b region was also
analyzed for the new type 3 isolates. After having determined the NS5b sequences,
comparison with the Ta and Tb subtypes described by Mori et al. (1992) was possible, and
the type 3 sequences could be identified as type 3a genotypes. The new type 4 isolates
segregated into 10 subtypes, based on homologies obtained in the NS5 and E1 regions. New
type 2 and 3 sequences could also be distinguished from previously described type 2 or 3
subtypes from sera collected in Belgium and the Netherlands.

The term "polynucleic acid" refers to a single stranded or double stranded nucleic acid
sequence which may contain from at least 5 contiguous nucleotides up to the complete
nucleotide sequence (e.g. at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more contiguous
nucleotides). A polynucleic acid which is up to about 100 nucleotides in length is often also
referred to as an oligonucleotide. A polynucleic acid may consist of deoxyribonucleotides or
ribonucleotides, nucleotide analogues or modified nucleotides, or may have been adapted for
therapeutic purposes. A polynucleic acid may also comprise a double stranded cDNA clone
which can be used for cloning purposes, or for in vivo therapy, or prophylaxis.

The term "polynucleic acid composition" refers to any kind of composition comprising
essentially said polynucleic acids. Said composition may be of a diagnostic or a therapeutic
nature.

The expression "nucleotides corresponding to" refers to nucleotides which are homologous
or complementary to an indicated nucleotide sequence or region within a specific HCV
sequence.

The term "coding region" corresponds to the region of the HCV genome that encodes the
HCV polyprotein. In fact, it comprises the complete genome with the exception of the 5'
untranslated region and 3' untranslated region.

The term "HCV polyprotein" refers to the HCV polyprotein of the HCV-J isolate (Kato
et al., 1990). The adenine residue at position 330 (Kato et al., 1990) is the first residue of
the ATG codon that initiates the long HCV polyprotein of 3010 amino acids in HCV-J and
other type 1b isolates, and of 3011 amino acids in HCV-1 and other type 1a isolates, and of
3033 amino acids in type 2 isolates HC-J6 and HC-J8 (Okamoto et al., 1992).

This adenine is designated as position 1 at the nucleic acid level, and this methionine is
designated as position 1 at the amino acid level, in the present invention. As type 1a isolates
contain 1 extra amino acid in the NS5a region, coding sequences of type 1a and 1b have
identical numbering in the Core, E1, NS3, and NS4 region, but will differ in the NS5b region
as indicated in Table 1. Type 2 isolates have 4 extra amino acids in the E2 region, and 17
or 18 extra amino acids in the NS5 region compared to type 1 isolates, and will differ in
numbering from type 1 isolates in the NS3/4 region and NS5b regions as indicated in Table 1.
TABLE 1

              Region          Positions        Positions        Positions        Positions
                              described in     described for    described for    described for
                              the present      HCV-J (Kato      HCV-1 (Choo      HC-J6, HC-J8
                              invention*       et al., 1990)    et al., 1991)    (Okamoto et
                                                                                 al., 1992)

Nucleotides   NS5b            8023/8235        8352/8564        8026/8238        8433/8645
                              7932/8271        8261/8600        7935/8274        8342/8681

              NS3/4           4664/5292        4993/5621        4664/5292        5017/5645
                              4664/4730        4993/5059        4664/4730        5017/5083
                              4892/5292        5221/5621        4892/5292        5245/5645
                              3856/4209        4185/4528        3856/4209        4209/4762
                              4936/5292        5265/5621        4936/5292        5289/5645

              coding region                    330/9359         1/9033           342/9439
              of present
              invention

Amino Acids   NS5b            2675/2745        2675/2745        2676/2746        2698/2768
                              2645/2757        2645/2757        2646/2758        2668/2780

              NS3/4           1556/1764        1556/1764        1556/1764        1560/1768
                              1286/1403        1286/1403        1286/1403        1290/1407
                              1646/1764        1646/1764        1646/1764        1650/1768

Table 1. Comparison of the HCV nucleotide and amino acid numbering system used in the
present invention (*) with the numbering used for other prototype isolates. For
example, 8352/8564 indicates the region designated by the numbering from
nucleotide 8352 to nucleotide 8564 as described by Kato et al. (1990). Since the
numbering system of the present invention starts at the polyprotein initiation site,
the 329 nucleotides of the 5' untranslated region described by Kato et al. (1990)
have to be subtracted, and the corresponding region is numbered from nucleotide
8023 ("8352-329") to 8235 ("8564-329").
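The conversion described in the caption of Table 1 is a fixed offset for the HCV-J numbering of Kato et al. (1990). The following Python sketch is given purely as an illustration under that assumption; the function names are hypothetical, and a fixed offset would not be valid for isolates such as HC-J6 or HC-J8, whose numbering also shifts because of extra codons in the E2 and NS5 regions.

```python
# Length of the 5' untranslated region in the numbering of Kato et al. (1990);
# the ATG initiating the polyprotein starts at Kato position 330.
KATO_5UR_LENGTH = 329


def kato_to_invention(position: int) -> int:
    """Convert an HCV-J nucleotide position (Kato numbering) to the numbering
    of the present invention, where the A of the initiating ATG is position 1."""
    if position <= KATO_5UR_LENGTH:
        raise ValueError("position lies in the 5' untranslated region")
    return position - KATO_5UR_LENGTH


def invention_to_kato(position: int) -> int:
    """Inverse conversion, back to the Kato et al. (1990) numbering."""
    if position < 1:
        raise ValueError("positions are numbered from 1")
    return position + KATO_5UR_LENGTH


def codon_number(nucleotide_position: int) -> int:
    """Amino acid position whose codon contains the given coding-region
    nucleotide (both numbered from 1 at the polyprotein initiation site)."""
    if nucleotide_position < 1:
        raise ValueError("positions are numbered from 1")
    return (nucleotide_position - 1) // 3 + 1
```

With these definitions the NS5b example of the caption maps 8352 to 8023 and 8564 to 8235, and those nucleotides fall in codons 2675 and 2745 respectively, in agreement with the amino acid rows of Table 1.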
The term "HCV type" corresponds to a group of HCV isolates of which the complete
genome shows more than 74% homology at the nucleic acid level, or of which the NS5 region
between nucleotide positions 7932 and 8271 shows more than 74% homology at the nucleic
acid level, or of which the complete HCV polyprotein shows more than 78% homology at the
amino acid level, or of which the NS5 region between amino acids at positions 2645 and 2757
shows more than 80% homology at the amino acid level, to polyproteins of the other isolates
of the group, with said numbering beginning at the first ATG codon or first methionine of the
long HCV polyprotein of the HCV-J isolate (Kato et al., 1990). Isolates belonging to different
types of HCV exhibit homologies, over the complete genome, of less than 74% at the nucleic
acid level and less than 78% at the amino acid level. Isolates belonging to the same type
usually show homologies of about 92 to 95% at the nucleic acid level and 95 to 96% at the
amino acid level when belonging to the same subtype, and those belonging to the same type
but different subtypes preferably show homologies of about 79% at the nucleic acid level and
85-86% at the amino acid level.

More preferably the definition of HCV types is concluded from the classification of HCV
isolates according to their nucleotide distances calculated as detailed below:

(1) based on phylogenetic analysis of nucleic acid sequences in the NS5b region between
nucleotides 7935 and 8274 (Choo et al., 1991) or 8261 and 8600 (Kato et al., 1990) or 8342
and 8681 (Okamoto et al., 1991), isolates belonging to the same HCV type show nucleotide
distances of less than 0.34, usually less than 0.33, and more usually of less than 0.32, and
isolates belonging to the same subtype show nucleotide distances of less than 0.135, usually
of less than 0.13, and more usually of less than 0.125, and consequently isolates belonging to
the same type but different subtypes show nucleotide distances ranging from 0.135 to 0.34,
usually ranging from 0.1384 to 0.2977, and more usually ranging from 0.15 to 0.32, and
isolates belonging to different HCV types show nucleotide distances greater than 0.34, usually
greater than 0.35, and more usually of greater than 0.358, more usually ranging from 0.1384
to 0.2977.

(2) based on phylogenetic analysis of nucleic acid sequences in the core/E1 region between
nucleotides 378 and 957, isolates belonging to the same HCV type show nucleotide distances
of less than 0.38, usually of less than 0.37, and more usually of less than 0.364, and isolates
belonging to the same subtype show nucleotide distances of less than 0.17, usually of less than
0.16, and more usually of less than 0.15, more usually less than 0.135, more usually less than
0.134, and consequently isolates belonging to the same type but different subtypes show
nucleotide distances ranging from 0.15 to 0.38, usually ranging from 0.16 to 0.37, and more
usually ranging from 0.17 to 0.36, more usually ranging from 0.133 to 0.379, and isolates
belonging to different HCV types show nucleotide distances greater than 0.34, 0.35, 0.36,
usually more than 0.365, and more usually of greater than 0.37.

(3) based on phylogenetic analysis of nucleic acid sequences in the NS3/NS4 region
between nucleotides 4664 and 5292 (Choo et al., 1991) or between nucleotides 4993 and 5621
(Kato et al., 1990) or between nucleotides 5017 and 5645 (Okamoto et al., 1991), isolates
belonging to the same HCV type show nucleotide distances of less than 0.35, usually of less
than 0.34, and more usually of less than 0.33, and isolates belonging to the same subtype show
nucleotide distances of less than 0.19, usually of less than 0.18, and more usually of less than
0.17, and consequently isolates belonging to the same type but different subtypes show
nucleotide distances ranging from 0.17 to 0.35, usually ranging from 0.18 to 0.34, and more
usually ranging from 0.19 to 0.33, and isolates belonging to different HCV types show
nucleotide distances greater than 0.33, usually greater than 0.34, and more usually of greater
than 0.35.
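The distance criteria of item (1) can be sketched as a simple decision rule. The Python fragment below is illustrative only: it uses the broadest NS5b thresholds stated above (0.135 and 0.34), treats the boundary values as belonging to the "same type, different subtype" range, and the function name and return strings are hypothetical. It assumes the distance was already computed for the NS5b region (for example with DNADIST) and does not compute it.

```python
# Broadest NS5b thresholds quoted in item (1).
NS5B_SUBTYPE_LIMIT = 0.135  # below this: same subtype
NS5B_TYPE_LIMIT = 0.34      # above this: different types


def classify_ns5b_distance(distance: float) -> str:
    """Classify a pairwise NS5b nucleotide distance between two isolates."""
    if distance < 0:
        raise ValueError("a nucleotide distance cannot be negative")
    if distance < NS5B_SUBTYPE_LIMIT:
        return "same subtype"
    if distance <= NS5B_TYPE_LIMIT:
        return "same type, different subtype"
    return "different types"
```

The same shape of rule would apply to the core/E1 and NS3/NS4 regions with the thresholds of items (2) and (3), although, as the ranges quoted there overlap, a single region should not be taken as decisive.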

Table 2. Molecular evolutionary distances

Region       Core/E1             E1                  NS5B                NS5B
             579 bp              384 bp              340 bp              222 bp

Isolates*    0.0017 - 0.1347     0.0026 - 0.2031     0.0003 - 0.1151     0.000 - 0.1323
             (0.0750 ± 0.0245)   (0.0969 ± 0.0239)   (0.0637 ± 0.0229)   (0.0607 ± 0.0205)

Subtypes*    0.1330 - 0.3794     0.1645 - 0.4869     0.1384 - 0.2977     0.117 - 0.3538
             (0.2786 ± 0.0363)   (0.3761 ± 0.0433)   (0.2219 ± 0.0341)   (0.2391 ± 0.0399)

Types*       0.3479 - 0.6306     0.4309 - 0.9561     0.3581 - 0.6670     0.3457 - 0.7471
             (0.4703 ± 0.0525)   (0.6308 ± 0.0928)   (0.4994 ± 0.0495)   (0.5295 ± 0.0627)

* Figures created by the PHYLIP program DNADIST are expressed as minimum to
maximum (average ± standard deviation). Phylogenetic distances for isolates belonging
to the same subtype ('isolates'), to different subtypes of the same type ('subtypes'), and
to different types ('types') are given.

In a comparative phylogenetic analysis of available sequences, ranges of molecular
evolutionary distances for different regions of the genome were calculated, based on 19,781
pairwise comparisons by means of the DNADIST program of the phylogeny inference
package PHYLIP version 3.5c (Felsenstein, 1993). The results are shown in Table 2 and
indicate that although the majority of distances obtained in each region fit with classification
of a certain isolate, only the ranges obtained in the 340 bp NS5B region are non-overlapping
and therefore conclusive. However, as was performed in the present invention, it is preferable
to obtain sequence information from at least 2 regions before final classification of a given
isolate.
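The statement that only the 340 bp NS5B ranges are non-overlapping can be checked directly against the minimum and maximum values of Table 2. The short Python sketch below does so; the dictionary layout and the name is_conclusive are illustrative choices, not part of the specification, and only the bounds needed for the comparison are transcribed.

```python
# Minimum/maximum distances from Table 2. For the 'types' row only the
# minimum is needed to test for overlap with the 'subtypes' row.
TABLE_2 = {
    "Core/E1 (579 bp)": {"isolates": (0.0017, 0.1347), "subtypes": (0.1330, 0.3794), "types_min": 0.3479},
    "E1 (384 bp)":      {"isolates": (0.0026, 0.2031), "subtypes": (0.1645, 0.4869), "types_min": 0.4309},
    "NS5B (340 bp)":    {"isolates": (0.0003, 0.1151), "subtypes": (0.1384, 0.2977), "types_min": 0.3581},
    "NS5B (222 bp)":    {"isolates": (0.0000, 0.1323), "subtypes": (0.1170, 0.3538), "types_min": 0.3457},
}


def is_conclusive(row: dict) -> bool:
    """A region is conclusive when the three ranges do not overlap:
    max(isolates) < min(subtypes) and max(subtypes) < min(types)."""
    isolates_max = row["isolates"][1]
    subtypes_min, subtypes_max = row["subtypes"]
    return isolates_max < subtypes_min and subtypes_max < row["types_min"]


conclusive_regions = [name for name, row in TABLE_2.items() if is_conclusive(row)]
```

Under these values the list contains only the 340 bp NS5B region; in the other three columns the isolate and subtype ranges, and the subtype and type ranges, overlap.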

Designation of a number to the different types of HCV and HCV type nomenclature is
based on chronological discovery of the different types. The numbering system used in the
present invention might still fluctuate according to international conventions or guidelines. For
example, "type 4" might be changed into "type 5" or "type 6".

The term "subtype" corresponds to a group of HCV isolates of which the complete
polyprotein shows a homology of more than 90% both at the nucleic acid and amino acid
levels, or of which the NS5 region between nucleotide positions 7932 and 8271 shows a
homology of more than 90% at the nucleic acid level to the corresponding parts of the
genomes of the other isolates of the same group, with said numbering beginning with the
adenine residue of the initiation codon of the HCV polyprotein. Isolates belonging to the same
type but different subtypes of HCV show homologies of more than 74% at the nucleic acid
level and of more than 78% at the amino acid level.
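The homology percentages used in these definitions can be illustrated with a simple identity count over two aligned sequences. The Python sketch below is an assumption-laden illustration: the specification does not state how homology is computed, so the choice made here (identical positions divided by compared positions, with any column containing a gap character "-" skipped) and the function name percent_identity are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences
    of equal length; columns with a gap ('-') in either sequence are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    identical = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip gapped columns
        compared += 1
        if base_a == base_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared
```

Applied to the NS5 region between positions 7932 and 8271, a value above 90% would point to the same subtype under the definition above, and a value above 74% to the same type.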

The term "BR36 subgroup" refers to a group of type 3a HCV isolates (BR36, BR33,
BR34) that are 95%, preferably 95.5%, most preferably 96% homologous to the sequences
as represented in SEQ ID NO 1, 3, 5, 7, 9, 11 in the NS5b region from position 8023 to
8235.

It is to be understood that extremely variable regions like the E1, E2 and NS4 regions will
exhibit lower homologies than the average homology of the complete genome or the
polyprotein.

Using these criteria, HCV isolates can be classified into at least 6 types. Several subtypes
can clearly be distinguished in types 1, 2, 3 and 4: 1a, 1b, 2a, 2b, 2c, 2d, 3a, 3b, 4a, 4b,
4c, 4d, 4e, 4f, 4g, 4h, 4i and 4j based on homologies of the 5' UR and coding regions
including the part of NS5 between positions 7932 and 8271. An overview of most of the
reported isolates and their proposed classification according to the typing system of the
present invention as well as other proposed classifications is presented in Table 3.
Table 3

HCV CLASSIFICATION

       OKAMOTO   MORI   NAKAO   CHA    PROTOTYPE

1a     I         I      Pt      GI     HCV-1, HCV-H, HC-J1
1b     II        II     K1      GII    HCV-J, HCV-BK, HCV-T, HC-JK1, HC-J4, HCV-CHINA
1c                                     HC-G9
2a     III       III    K2a     GIII   HC-J6
2b     IV        IV     K2b     GIII   HC-J8
2c                                     S83, ARG6, ARG3, 110, T985
2d
3a     V         V      K3      GIV    E-b1, Ta, BR36, BR33, HD10, NZL1
3b               VI     K3      GIV    HCV-TR, Tb
3c                                     BE98
4a                                     Z4, GB809-4
4b                                     Z1
4c                                     GB116, GB353, GB215, Z6, Z7
4d                                     DK13
4e                                     GB809-:, CAM600, CAM736
4f                                     CAM622, CAM627
4g                                     GB549
4h                                     GB438
4i                                     CAR4/1205
4j                                     CAR1/501
4k                                     EG29
5a                              GV     SA3, SA4, SA1, SA7, SA11, BE95
6a                                     HK1, HK2, HK3, HK4
The term "complement" refers to a nucleotide sequence which is complementary to an
indicated sequence and which is able to hybridize to the indicated sequences.
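As an illustration of this definition, the complementary strand of a DNA sequence can be derived by pairing each base with its Watson-Crick partner and reading the result in the opposite direction. The Python sketch below assumes plain DNA written 5' to 3' with the letters A, C, G and T only; the function name is hypothetical, and modified nucleotides or inosine, which the specification also allows, are not handled.

```python
# Watson-Crick pairing for unmodified DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    sequence = sequence.upper()
    unknown = set(sequence) - set("ACGT")
    if unknown:
        raise ValueError(f"unsupported characters: {sorted(unknown)}")
    return sequence.translate(_COMPLEMENT)[::-1]
```

A probe and the reverse complement of its target region are the two strands that would be expected to hybridize.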

The composition of the invention can comprise many combinations. By way of example, 
the composition of the invention can comprise: 

- two (or more) nucleic acids from the same region or, 

- two nucleic acids (or more), respectively from different regions, for the same isolate or
for different isolates,

- or nucleic acids from the same regions and from at least two different regions (for the 
same isolate or for different isolates). 

The present invention relates more particularly to a polynucleic acid composition as defined
above, wherein said polynucleic acid corresponds to a nucleotide sequence selected from any
of the following HCV type 3 genomic sequences:

- an HCV genomic sequence having a homology of at least 67%, preferably more than 69%,
more preferably 71%, even more preferably more than 73%, or most preferably more than
76% to any of the sequences as represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or
27 (HD10, BR36 or BR33 sequences) in the region spanning positions 417 to 957 of the
Core/E1 region as shown in Figure 4;

- an HCV genomic sequence having a homology of at least 65%, preferably more than 67%,
preferably more than 69%, even preferably more than 70%, most preferably more than
74% to any of the sequences as represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or
27 (HD10, BR36 or BR33 sequences) in the region spanning positions 574 to 957 of the
E1 region as shown in Figure 4;

- an HCV genomic sequence having a homology of at least 79%, more preferably at least
81%, most preferably 83% or more to any of the sequences as represented in
SEQ ID NO 147 (representing positions 1 to 346 of the Core region of HCV type 3c,
sequence BE98) in the region spanning positions 1 to 378 of the Core region as shown in
Figure 3;

- an HCV genomic sequence of HCV type 3a having a homology of at least 74%, more
preferably at least 76%, most preferably 78% or more to any of the sequences
as represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or 27 (HD10, BR36 or BR33
sequences) in the region spanning positions 417 to 957 in the Core/E1 region as shown in
Figure 4;

- an HCV genomic sequence of HCV type 3a having a homology of at least 74%,
preferably more than 76%, most preferably 78% or more to any of the sequences as
represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or 27 (HD10, BR36 or BR33
sequences) in the region spanning positions 574 to 957 in the E1 region as shown in Figure
4;

- an HCV genomic sequence having a homology of more than 73.5%, preferably more
than 74%, most preferably 75% homology to the sequence as represented in SEQ ID NO
29 (HCC153 sequence) in the region spanning positions 4664 to 4730 of the NS3 region
as shown in Figure 6;

- an HCV genomic sequence having a homology of more than 70%, preferably more than
72%, most preferably more than 74% homology to any of the sequences as represented
in SEQ ID NO 29, 31, 33, 35, 37 or 39 (HCC153, HD10, BR36 sequences) in the region
spanning positions 4892 to 5292 in the NS3/NS4 region as shown in Figure 6 or 10;

- an HCV genomic sequence of the BR36 subgroup of HCV type 3a having a homology
of more than 95%, preferably 95.5%, most preferably 96% homology to any of the
sequences as represented in SEQ ID NO 5, 7, 1, 3, 9 or 11 (BR34, BR33, BR36
sequences) in the region spanning positions 8023 to 8235 of the NS5 region as shown in
Figure 1;

- an HCV genomic sequence of the BR36 subgroup of HCV type 3a having a homology
of more than 96%, preferably 96.5%, most preferably 97% homology to any of the
sequences as represented in SEQ ID NO 5, 7, 1, 3, 9 or 11 (BR34, BR33, BR36
sequences) in the region spanning positions 8023 to 8192 of the NS5B region as shown in
Figure 1;

- an HCV genomic sequence of HCV type 3c being characterized as having a homology of
more than 79%, more preferably more than 81%, and most preferably more than 83% to
the sequence as represented in SEQ ID NO 149 (BE98 sequence) in the region spanning
positions 7932 to 8271 in the NS5B region as shown in Figure 1.

Preferentially the above-mentioned genomic HCV sequences depict sequences from the 
coding regions of all the above-mentioned sequences. 

According to the nucleotide distance classification system (with said nucleotide distances
being calculated as explained above), said sequences of said composition are selected from:

- an HCV genomic sequence being characterized as having a nucleotide distance of less than
0.44, preferably of less than 0.40, most preferably of less than 0.36 to any of the
sequences as represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or 27 in the region
spanning positions 417 to 957 of the Core/E1 region as shown in Figure 4;

- an HCV genomic sequence being characterized as having a nucleotide distance of less than
0.53, preferably less than 0.49, most preferably of less than 0.45 to any of the sequences
as represented in SEQ ID NO 19, 21, 23, 25 or 27 in the region spanning positions 574
to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence being characterized as having a nucleotide distance of less than
0.15, preferably less than 0.13, and most preferably less than 0.11 to any of the sequences
as represented in SEQ ID NO 147 in the region spanning positions 1 to 378 of the Core
region as shown in Figure 3;

- an HCV genomic sequence of HCV type 3a being characterized as having a nucleotide
distance of less than 0.3, preferably less than 0.26, most preferably of less than 0.22 to
any of the sequences as represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or 27 in the
region spanning positions 417 to 957 in the Core/E1 region as shown in Figure 4;

- an HCV genomic sequence of HCV type 3a being characterized as having a nucleotide
distance of less than 0.35, preferably less than 0.31, most preferably of less than 0.27 to
any of the sequences as represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or 27 in the
region spanning positions 574 to 957 in the E1 region as shown in Figure 4;

- an HCV genomic sequence of the BR36 subgroup of HCV type 3a being characterized as
having a nucleotide distance of less than 0.0423, preferably less than 0.042, preferably
less than 0.0362 to any of the sequences as represented in SEQ ID NO 5, 7, 1, 3, 9 or 11
in the region spanning positions 8023 to 8235 of the NS5 region as shown in Figure 1;

- an HCV genomic sequence of HCV type 3c being characterized as having a nucleotide 
distance of less than 0.255, preferably of less than 0.25, more preferably of less than 0.21, 
most preferably of less than 0.17 to the sequence as represented in SEQ ID NO 149 in the 
region spanning positions 7932 to 8271 in the NS5B region as shown in Figure 1. 

In the present application, the E1 sequences encoding the antigenic ectodomain of the E1
protein, which does not overlap the carboxyterminal signal-anchor sequences of E1 disclosed
by Cha et al. (1992; WO 92/19743), in addition to the NS4 epitope region, and a part of the
NS5 region are disclosed for 4 different isolates: BR33, BR34, BR36, HCC153 and HD10,
all belonging to type 3a (SEQ ID NO 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29,
31, 35, 37 or 39).

Also within the present invention are new subtype 3c sequences (SEQ ID NO 147, 149) of
the isolate BE98 in the Core and NS5 regions (see Figures 3 and 1).

Finally, the present invention also relates to a new subtype 3a sequence as represented in
SEQ ID NO 217 (see Figure 1).

Also included within the present invention are sequence variants of the polynucleic acids
as selected from any of the nucleotide sequences as given in any of the above mentioned SEQ
ID numbers, with said sequence variants containing either deletions and/or insertions of one
or more nucleotides, mainly at the extremities of oligonucleotides (either 3' or 5'), or
substitutions of some non-essential nucleotides by others (including modified nucleotides and/or
inosine); for example, a type 1 or 2 sequence might be modified into a type 3 sequence by
replacing some nucleotides of the type 1 or 2 sequence with type-specific nucleotides of type
3 as shown in Figure 1 (NS5 region), Figure 3 (Core region), Figure 4 (Core/E1 region) and
Figures 6 and 10 (NS3/NS4 region).

According to another embodiment, the present invention relates to a polynucleic acid
composition as defined above, wherein said polynucleic acids correspond to a nucleotide 
sequence selected from any of the following HCV type 5 genomic sequences: 

- an HCV genomic sequence having a homology of more than 85%, preferably more than
86%, most preferably more than 87% homology to any of the sequences as represented
in SEQ ID NO 41, 43, 45, 47, 49, 51, 53 (PC sequences) or 151 (BE95 sequence) in the
region spanning positions 1 to 573 of the Core region as shown in Figures 9 and 3;

- an HCV genomic sequence having a homology of more than 61%, preferably more than
63%, more preferably more than 65% homology, even more preferably more than 66%
homology and most preferably more than 67% homology (e.g. 69 and 71%) to any of the
sequences as represented in SEQ ID NO 41, 43, 45, 47, 49, 51, 53 (PC sequences), 153
or 155 (BE95, BE100 sequences) in the region spanning positions 574 to 957 of the E1
region as shown in Figure 4;

- an HCV genomic sequence having a homology of more than 76.5%, preferably of more
than 77%, most preferably of more than 78% homology with any of the sequences as 
represented in SEQ ID NO 55, 57, 197 or 199 (PC sequences) in the region spanning 
positions 3856 to 4209 of the NS3 region as shown in Figure 6 or 10; 

- an HCV genomic sequence having a homology of more than 68%, preferably of more than
70%, most preferably of more than 72% homology with the sequence as represented in 
SEQ ID NO 157 (BE95 sequence) in the region spanning positions 980 to 1179 of the 
E1/E2 region as shown in Figure 13; 

- an HCV genomic sequence having a homology of more than 57%, preferably more than
59%, most preferably more than 61% homology to any of the sequences as represented
in SEQ ID NO 59 or 61 (PC sequences) in the region spanning positions 4936 to 5296 of
the NS4 region as shown in Figure 6 or 10;

- an HCV genomic sequence having a homology of more than 93%, preferably more than
93.5%, most preferably more than 94% homology to any of the sequences as represented
in SEQ ID NO 159 or 161 (BE95 or BE96 sequences) in the region spanning positions
7932 to 8271 of the NS5B region as shown in Figure 1.

Preferentially the above-mentioned genomic HCV sequences depict sequences from the 
coding regions of all the above-mentioned sequences. 
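
As an illustration of how the percentage thresholds listed above could be applied, the following minimal Python sketch computes percent homology between two pre-aligned sequences over a region and reports which preference tier it reaches. It assumes that homology means the share of identical nucleotides at aligned positions and that position 1 is the first column of the alignment; the function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_homology(seq_a, seq_b, start, end):
    """Percent identical nucleotides between two pre-aligned sequences over
    the 1-based, inclusive region [start, end]. Columns with a gap ('-') in
    either sequence are skipped (an assumption of this sketch)."""
    region_a = seq_a[start - 1:end].upper()
    region_b = seq_b[start - 1:end].upper()
    if len(region_a) != len(region_b):
        raise ValueError("sequences must be aligned over the whole region")
    compared = 0
    identical = 0
    for a, b in zip(region_a, region_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions in the region")
    return 100.0 * identical / compared


def homology_tier(percent, thresholds):
    """Map a percentage onto (broad, preferred, most preferred) thresholds,
    each read as 'more than', as in the text."""
    broad, preferred, most_preferred = thresholds
    if percent > most_preferred:
        return "most preferred"
    if percent > preferred:
        return "preferred"
    if percent > broad:
        return "broad"
    return "outside"


# Toy 10-nucleotide alignment with one mismatch (hypothetical data).
reference = "ACGTACGTAC"
candidate = "ACGTTCGTAC"
value = percent_homology(reference, candidate, 1, 10)
# 85, 86 and 87% are the thresholds given above for the type 5 Core region.
print(value, homology_tier(value, (85.0, 86.0, 87.0)))
```

In practice the positions would refer to the numbering used in the figures, so an alignment to the reference genome would be needed before slicing the region.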

According to the nucleotide distance classification system (with said nucleotide distances 
being calculated as explained above), said sequences of said composition are selected from: 

- a nucleotide distance of less than 0.53, preferably less than 0.51, more preferably less than
0.49 for the E1 region to the type 5 sequences depicted above;

- a nucleotide distance of less than 0.3, preferably less than 0.28, more preferably of less
than 0.26 for the Core region to the type 5 sequences depicted above;

- a nucleotide distance of less than 0.072, preferably less than 0.071, more preferably less
than 0.070 for the NS5B region to the type 5 sequences as depicted above.
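
The distance limits above can be checked in the same spirit. The sketch below is only an illustration: the way nucleotide distances are actually calculated is the one explained earlier in this document, whereas the code assumes, purely for demonstration, an uncorrected proportion of differences (p-distance) followed by the Jukes-Cantor correction. All function names and the toy sequences are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing nucleotides between two aligned sequences,
    ignoring columns that contain a gap ('-')."""
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(1 for a, b in pairs if a != b) / len(pairs)


def jukes_cantor_distance(p):
    """Jukes-Cantor correction d = -(3/4) * ln(1 - 4p/3), defined for p < 0.75.
    This model is an assumption of the sketch, not the document's method."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor model")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def within_distance(distance, limit):
    """True when the distance is strictly less than the stated limit."""
    return distance < limit


reference = "ACGTACGTAC"   # hypothetical aligned fragments
candidate = "ACGTTCGTAC"
p = p_distance(reference, candidate)
d = jukes_cantor_distance(p)
# 0.3 is the broadest Core-region limit stated above for type 5.
print(round(p, 3), round(d, 4), within_distance(d, 0.3))
```

A different substitution model would give somewhat different distances for the same pair of sequences, so the limits are only meaningful together with the calculation method they were derived with.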

Isolates with similar sequences in the 5'UR to a group of isolates including SA1, SA3, and
SA7 described in the 5'UR by Bukh et al. (1992), have been reported and described in the
5'UR and NS5 region as group V by Cha et al. (1992; WO 92/19743). This group of isolates
belongs to type 5a as described in the present invention (SEQ ID NO 41, 43, 45, 47, 49, 51,
53, 55, 57, 59, 61, 151, 153, 155, 157, 159, 161, 197 and 199).

Also included within the present invention are sequence variants of the polynucleic acids
as selected from any of the nucleotide sequences as given in any of the above given SEQ ID
numbers with said sequence variants containing either deletions and/or insertions of one or
more nucleotides, mainly at the extremities of oligonucleotides (either 3' or 5'), or
substitutions of some non-essential nucleotides (i.e. nucleotides not essential to discriminate
between different genotypes of HCV) by others (including modified nucleotides and/or
inosine); for example, a type 1 or 2 sequence might be modified into a type 5 sequence by
replacing some nucleotides of the type 1 or 2 sequence with type-specific nucleotides of type
5 as shown in Figure 3 (Core region), Figure 4 (Core/E1 region), Figure 10 (NS3/NS4
region) and Figure 14 (E1/E2 region).

Another group of isolates including BU74 and BU79 having similar sequences in the 5'UR
to isolates including Z6 and Z7 as described in the 5'UR by Bukh et al. (1992), have been
described in the 5'UR and classified as a new type 4 by the inventors of this application
(Stuyver et al., 1993). Coding sequences, including Core, E1 and NS5 sequences of several
new Gabonese isolates belonging to this group, are disclosed in the present invention (SEQ
ID NO 106, 108, 110, 112, 114, 116, 118, 120 and 122).

According to yet another embodiment, the present invention relates to a composition as 
defined above, wherein said polynucleic acids correspond to a nucleotide sequence selected
from any of the following HCV type 4 genomic sequences: 

- an HCV genomic sequence having a homology of more than 66%, preferably more than
68%, most preferably more than 70% homology in the E1 region spanning positions 574
to 957 to any of the sequences as represented in SEQ ID NO 118, 120 or 122 (GB358,
GB549, GB809 sequences) as shown in Figure 4;

- an HCV genomic sequence having a homology of more than 71%, preferably more than
72%, most preferably more than 74% homology to any of the sequences as represented
in SEQ ID NO 118, 120 or 122 (GB358, GB549, GB809 sequences) in the region spanning
positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence having a homology of more than 92%, preferably more than 
93%, most preferably more than 94% homology to any of the sequences as represented 
in SEQ ID NO 163 or 165 (GB809, CAM600 sequences) in the region spanning positions 
1 to 378 of the Core/E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4c) having a homology of more than 85%, preferably
more than 86%, more preferably more than 86.5% homology, most preferably more than
87%, more than 88% or more than 89% homology to any of the sequences as represented in
SEQ ID NO 183, 185 or 187 (GB116, GB215, GB809 sequences) in the region spanning
positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4a) having a homology of more than 81%, preferably
more than 83%, most preferably more than 85% homology to the sequence as represented
in SEQ ID NO 189 (GB908 sequence) in the region spanning positions 379 to 957 of the
E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4e) having a homology of more than 85%, preferably
more than 87%, most preferably more than 89% homology to any of the sequences as
represented in SEQ ID NO 167 or 169 (CAM600, GB908 sequences) in the region
spanning positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4f) having a homology of more than 79%, preferably
more than 81%, most preferably more than 83% homology to any of the sequences as
represented in SEQ ID NO 171 or 173 (CAMG22, CAMG27 sequences) in the region
spanning positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4g) having a homology of more than 84%, preferably
more than 86%, most preferably more than 88% homology to the sequence as represented
in SEQ ID NO 175 (GB549 sequence) in the region spanning positions 379 to 957 of the
E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4h) having a homology of more than 83%, preferably
more than 85%, most preferably more than 87% homology to the sequence as represented
in SEQ ID NO 177 (GB438 sequence) in the region spanning positions 379 to 957 of the
E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4i) having a homology of more than 76%,
preferably more than 78%, most preferably more than 80% homology to the sequence as
represented in SEQ ID NO 179 (CAR4/1205 sequence) in the region spanning positions
379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4j?) having a homology of more than 84%, preferably
more than 86%, most preferably more than 88% homology to the sequence as represented
in SEQ ID NO 181 (CAR4/901 sequence) in the region spanning positions 379 to 957 of
the E1 region as shown in Figure 4;

- an HCV genomic sequence having a homology of more than 73%, preferably more than
75%, most preferably more than 77% homology to any of the sequences as represented
in SEQ ID NO 106, 108, 110, 112, 114 or 116 (GB48, GB116, GB215, GB358, GB549,
GB809 sequences) in the region spanning positions 7932 to 8271 of the NS5 region as
shown in Figure 1;

- an HCV genomic sequence (subtype 4c) having a homology of more than 88%, preferably
more than 89%, most preferably more than 90% homology to any of the sequences as
represented in SEQ ID NO 106, 108, 110 or 112 (GB48, GB116, GB215, GB358
sequences) in the region spanning positions 7932 to 8271 of the NS5 region as shown in
Figure 1;

- an HCV genomic sequence (subtype 4e) having a homology of more than 88%, preferably
more than 89%, most preferably more than 90% homology to any of the sequences as
represented in SEQ ID NO 116 or 201 (GB809 or CAM600 sequences) in the region
spanning positions 7932 to 8271 of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4f) having a homology of more than 87%, preferably
more than 89%, most preferably more than 90% homology to the sequence as represented
in SEQ ID NO 203 (CAMG22 sequence) in the region spanning positions 7932 to 8271
of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4g) having a homology of more than 85%,
preferably more than 87%, most preferably more than 89% homology to the sequence as
represented in SEQ ID NO 114 (GB549 sequence) in the region spanning positions 7932
to 8271 of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4h) having a homology of more than 86%,
preferably more than 87%, more preferably more than 88% homology, most preferably
more than 89% homology to the sequence as represented in SEQ ID NO 207 (GB457
sequence) in the region spanning positions 7932 to 8271 of the NS5 region as shown in
Figure 1;

- an HCV genomic sequence (subtype 4i) having a homology of more than 84%, preferably
more than 86%, most preferably more than 88% homology to the sequence as represented
in SEQ ID NO 209 (CAR4/1205 sequence) in the region spanning positions 7932 to 8271
of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4j) having a homology of more than 81%, preferably
more than 83%, most preferably more than 85% homology to the sequence as represented
in SEQ ID NO 211 (CAR1/501 sequence) in the region spanning positions 7932 to 8271
of the NS5 region as shown in Figure 1.

Preferentially the above-mentioned genomic HCV sequences depict sequences from the 
coding regions of all the above-mentioned sequences. 

According to the nucleotide distance classification system (with said nucleotide distances 
being calculated as explained above), said sequences of said composition are selected from: 

- an HCV genomic sequence (type 4) being characterized as having a nucleotide distance of
less than 0.52, 0.50, 0.4880, 0.46, 0.44, 0.43 or most preferably less than 0.42 in the
region spanning positions 574 to 957 to any of the sequences as represented in SEQ ID NO
118, 120 or 122 in the region spanning positions 1 to 957 of the Core/E1 region as shown
in Figure 4;

- an HCV genomic sequence (type 4) being characterized as having a nucleotide distance of
less than 0.39, 0.36, 0.34, 0.32 or most preferably less than 0.31 to any of the sequences
as represented in SEQ ID NO 118, 120 or 122 in the region spanning positions 379 to 957
of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4c) being characterized as having a nucleotide distance
of less than 0.27, 0.26, 0.24, 0.22, 0.20, 0.18, 0.17, 0.162, 0.16 or most preferably less
than 0.15 to any of the sequences as represented in SEQ ID NO 183, 185 or 187 in the
region spanning positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4a) being characterized as having a nucleotide distance
of less than 0.30, 0.28, 0.26, 0.24, 0.22, 0.21 or most preferably of less than 0.205 to the
sequence as represented in SEQ ID NO 189 in the region spanning positions 379 to 957
of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4e) being characterized as having a nucleotide distance
of less than 0.26, 0.25, 0.23, 0.21, 0.19, 0.17, 0.165, most preferably less than 0.16 to
any of the sequences as represented in SEQ ID NO 167 or 169 in the region spanning
positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4f) being characterized as having a nucleotide distance
of less than 0.26, 0.24, 0.22, 0.20, 0.18, 0.16, 0.15 or most preferably less than 0.14 to
any of the sequences as represented in SEQ ID NO 171 or 173 in the region spanning
positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4g) being characterized as having a nucleotide
distance of less than 0.20, 0.19, 0.18, 0.17 or most preferably of less than 0.16 to the
sequence as represented in SEQ ID NO 175 in the region spanning positions 379 to 957
of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4h) being characterized as having a nucleotide
distance of less than 0.20, 0.19, 0.18, 0.17 and most preferably of less than 0.16 to the
sequence as represented in SEQ ID NO 177 in the region spanning positions 379 to 957
of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4i) being characterized as having a nucleotide distance
of less than 0.27, 0.25, 0.23, 0.21 and preferably less than 0.16 to the sequence as
represented in SEQ ID NO 179 in the region spanning positions 379 to 957 of the E1
region as shown in Figure 4;

- an HCV genomic sequence (subtype 4j?) being characterized as having a nucleotide
distance of less than 0.19, 0.18, 0.17, 0.165 and most preferably of less than 0.16 to the
sequence as represented in SEQ ID NO 181 in the region spanning positions 379 to 957
of the E1 region as shown in Figure 4;

- an HCV genomic sequence (type 4) being characterized as having a nucleotide distance of
less than 0.35, 0.34, 0.32 and most preferably of less than 0.30 to any of the sequences
as represented in SEQ ID NO 106, 108, 110, 112, 114 or 116 in the region spanning
positions 7932 to 8271 of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4c) being characterized as having a nucleotide distance
of less than 0.18, 0.16, 0.14, 0.135, 0.13, 0.1275 or most preferably less than 0.125 to
any of the sequences as represented in SEQ ID NO 106, 108, 110 or 112 in the region
spanning positions 7932 to 8271 of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4e) being characterized as having a nucleotide distance
of less than 0.15, 0.14, 0.135, 0.13 and most preferably of less than 0.125 to any of the
sequences as represented in SEQ ID NO 116 or 201 in the region spanning positions 7932
to 8271 of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4f) being characterized as having a nucleotide distance
of less than 0.15, 0.14, 0.135, 0.13 or most preferably less than 0.125 to the sequence as
represented in SEQ ID NO 203 in the region spanning positions 7932 to 8271 of the NS5
region as shown in Figure 1;

- an HCV genomic sequence (subtype 4g) being characterized as having a nucleotide
distance of less than 0.17, 0.16, 0.15, 0.14, 0.13 or most preferably less than 0.125 to the
sequence as represented in SEQ ID NO 114 in the region spanning positions 7932 to 8271
of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4h) being characterized as having a nucleotide 
distance of less than 0.155, 0.15, 0.145, 0.14, 0.135, 0.13 or most preferably less than 
0.125 to the sequence as represented in SEQ ID NO 207 in the region spanning positions
7932 to 8271 of the NS5 region as shown in Figure 1; 

- an HCV genomic sequence (subtype 4i) being characterized as having a nucleotide distance
of less than 0.17, 0.16, 0.15, 0.14, 0.13 or most preferably of less than 0.125 to the
sequence as represented in SEQ ID NO 209 in the region spanning positions 7932 to 8271
of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4j) being characterized as having a nucleotide distance
of less than 0.21, 0.20, 0.19, 0.18, 0.17, 0.16, 0.15, 0.14, 0.13 and most preferably of
less than 0.125 to the sequence as represented in SEQ ID NO 211 in the region spanning
positions 7932 to 8271 of the NS5 region as shown in Figure 1.

Also included within the present invention are sequence variants of the polynucleic acids
as selected from any of the nucleotide sequences as given in any of the above given SEQ ID
numbers with said sequence variants containing either deletions and/or insertions of one or
more nucleotides, mainly at the extremities of oligonucleotides (either 3' or 5'), or
substitutions of some non-essential nucleotides (i.e. nucleotides not essential to discriminate
between different genotypes of HCV) by others (including modified nucleotides and/or
inosine); for example, a type 1 or 2 sequence might be modified into a type 4 sequence by
replacing some nucleotides of the type 1 or 2 sequence with type-specific nucleotides of type
4 as shown in Figure 3 (Core region), Figure 4 (Core/E1 region), Figure 10 (NS3/NS4
region) and Figure 14 (E1/E2 region).

The present invention also relates to a sequence as represented in SEQ ID NO 193 (GB72J-
sequence).

After aligning NS5 or E1 sequences of GB48, GB116, GB215, GB358, GB549 and
GB809, these isolates clearly segregated into 3 subtypes within type 4: GB48, GB116,
GB215 and GB358 belong to the subtype designated 4c, GB549 to subtype 4g and GB809 to
subtype 4e. In NS5, GB809 (subtype 4e) showed a higher nucleic acid homology to subtype
4c isolates (85.6 - 86.8%) than to GB549 (subtype 4g, 79.7%), while GB549 showed similar
homologies to both other subtypes (78.8 to 80% to subtype 4c and 79.7% to subtype 4e). In
E1, subtype 4c showed equal nucleic acid homologies of 75.2% to subtypes 4g and 4e while
4g and 4e were 78.4% homologous. At the amino acid level however, subtype 4e showed a
normal homology to subtype 4c (80.2%), while subtype 4g was more homologous to 4c
(83.3%) and 4e (84.1%).

According to yet another embodiment, the present invention relates to a composition as 
defined above, wherein said polynucleic acids correspond to a nucleotide sequence selected
from any of the following HCV type 2d genomic sequences: 

- an HCV genomic sequence having a homology of more than 78%, preferably more than
80%, most preferably more than 82% homology to the sequence as represented in SEQ
ID NO 143 (NE92) in the region spanning positions 379 to 957 of the Core/E1 region as
shown in Figure 4;

- an HCV genomic sequence having a homology of more than 74%, preferably more than
76%, most preferably more than 78% homology to the sequence as represented in SEQ
ID NO 143 (NE92) in the region spanning positions 574 to 957 as shown in Figure 4;

- an HCV genomic sequence having a homology of more than 87%, preferably more than
89%, most preferably more than 91% homology to the sequence as represented in SEQ
ID NO 145 (NE92) in the region spanning positions 7932 to 8271 of the NS5B region as
shown in Figure 1.

Preferentially the above-mentioned genomic HCV sequences depict sequences from the
coding regions of all the above-mentioned sequences. 

According to the nucleotide distance classification system (with said nucleotide distances 
being calculated as explained above), said sequences of said composition are selected from: 

- a nucleotide distance of less than 0.32, preferably less than 0.31, more preferably less than
0.30 for the E1 region (574 to 957) to any of the above specified sequences;

- a nucleotide distance of less than 0.08, preferably less than 0.07, more preferably less than
0.06 for the Core region (1 to 378) to any of the above given sequences;

- a nucleotide distance of less than 0.15, preferentially less than 0.13, more preferentially
less than 0.12 for the NS5B region to any of the above-specified sequences.

Polynucleic acid sequences according to the present invention which are homologous to the
sequences as represented by a SEQ ID NO can be characterized and isolated according to any
of the techniques known in the art, such as amplification by means of type or subtype specific
primers, hybridization with type or subtype specific probes under more or less stringent
conditions, serological screening methods (see examples 4 and 11) or via the LiPA typing
system.

Polynucleic acid sequences of the genomes indicated above from regions not yet depicted 
in the present examples, figures and sequence listing can be obtained by any of the techniques 
known in the art, such as amplification techniques using suitable primers from the type or
subtype specific sequences of the present invention. 

The present invention relates also to a composition as defined above, wherein said
polynucleic acid is liable to act as a primer for amplifying the nucleic acid of a certain isolate 
belonging to the genotype from which the primer is derived. 

An example of a primer according to this embodiment of the invention is HCPr 152 as 
shown in table 7 (SEQ ID NO 79). 

The term "primer" refers to a single stranded DNA oligonucleotide sequence capable of
acting as a point of initiation for synthesis of a primer extension product which is
complementary to the nucleic acid strand to be copied. The length and the sequence of the
primer must be such that they allow priming of the synthesis of the extension products.
Preferably the primer is about 5-50 nucleotides long. Specific length and sequence will depend on
the complexity of the required DNA or RNA targets, as well as on the conditions of primer
use such as temperature and ionic strength.

The fact that amplification primers do not have to match exactly with the corresponding
template sequence to warrant proper amplification is amply documented in the literature
(Kwok et al., 1990).
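
To make the idea of an inexact primer match concrete, the following minimal Python sketch slides a primer along a template and reports the site with the fewest mismatches. It is a simplification: the primer is written in the same orientation as the template strand, no thermodynamics are modelled, and the tolerated number of mismatches (`max_mismatches`) is an arbitrary illustrative parameter rather than a value taken from this document or from Kwok et al. The function names and sequences are hypothetical.

```python
def best_primer_site(template, primer):
    """Slide the primer along the template (same strand orientation, an
    assumption of this sketch) and return (offset, mismatches) for the
    0-based offset with the fewest mismatches; ties go to the first offset."""
    template = template.upper()
    primer = primer.upper()
    if len(primer) > len(template):
        raise ValueError("primer is longer than template")
    best = None
    for offset in range(len(template) - len(primer) + 1):
        window = template[offset:offset + len(primer)]
        mismatches = sum(1 for t, p in zip(window, primer) if t != p)
        if best is None or mismatches < best[1]:
            best = (offset, mismatches)
    return best


def can_prime(template, primer, max_mismatches=2):
    """True when the best site has at most max_mismatches mismatches.
    The default of 2 is illustrative only."""
    _, mismatches = best_primer_site(template, primer)
    return mismatches <= max_mismatches


template = "GGGACGTTAGCCC"   # hypothetical template fragment
primer = "ACGTAAGC"          # hypothetical primer with one internal mismatch
print(best_primer_site(template, primer), can_prime(template, primer))
```

A more realistic model would weight mismatches near the 3' end of the primer more heavily, since these tend to interfere most with extension.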

The amplification method used can be either polymerase chain reaction (PCR; Saiki et al.,
1988), ligase chain reaction (LCR; Landgren et al., 1988; Wu & Wallace, 1989; Barany,
1991), nucleic acid sequence-based amplification (NASBA; Guatelli et al., 1990; Compton,
1991), transcription-based amplification system (TAS; Kwoh et al., 1989), strand
displacement amplification (SDA; Duck, 1990; Walker et al., 1992) or amplification by
means of Qβ replicase (Lizardi et al., 1988; Lomeli et al., 1989) or any other suitable method
to amplify nucleic acid molecules using primer extension. During amplification, the amplified
products can be conveniently labelled either using labelled primers or by incorporating
labelled nucleotides. Labels may be isotopic (32P, 35S, etc.) or non-isotopic (biotin,
digoxigenin, etc.). The amplification reaction is repeated between 20 and 80 times,
advantageously between 30 and 50 times.

The present invention also relates to a composition as defined above, wherein said
polynucleic acid is able to act as a hybridization probe for specific detection and/or
classification into types of a nucleic acid containing said nucleotide sequence, with said
oligonucleotide being possibly labelled or attached to a solid substrate.

The term "probe" refers to single stranded sequence-specific oligonucleotides which have 
a sequence which is complementary to the target sequence of the HCV genotype(s) to be
detected. 

Preferably, these probes are about 5 to 50 nucleotides long, more preferably from about 
10 to 25 nucleotides. 

The term "solid support" can refer to any substrate to which an oligonucleotide probe can
be coupled, provided that it retains its hybridization characteristics and provided that the
background level of hybridization remains low. Usually the solid substrate will be a microtiter
plate, a membrane (e.g. nylon or nitrocellulose) or a microsphere (bead). Prior to application
to the membrane or fixation it may be convenient to modify the nucleic acid probe in order
to facilitate fixation or improve the hybridization efficiency. Such modifications may
encompass homopolymer tailing, coupling with different reactive groups such as aliphatic
groups, NH2 groups, SH groups, carboxylic groups, or coupling with biotin or haptens.

The present invention also relates to the use of a composition as defined above for
detecting the presence of one or more HCV genotypes, more particularly for detecting the
presence of a nucleic acid of any of the HCV genotypes having a nucleotide sequence as
defined above, present in a biological sample liable to contain them, comprising at least the
following steps:

(i) possibly extracting sample nucleic acid, 

(ii) possibly amplifying the nucleic acid with at least one of the primers as defined
above or any other HCV subtype 2d, HCV type 3, HCV type 4, HCV type 5
or universal HCV primer,

(iii) hybridizing the nucleic acids of the biological sample, possibly under denatured
conditions, and with said nucleic acids being possibly labelled during or after
amplification, at appropriate conditions with one or more probes as defined above,
with said probes being preferably attached to a solid substrate,

(iv) washing at appropriate conditions, 

(v) detecting the hybrids formed, 

(vi) inferring the presence of one or more HCV genotypes present from the observed 
hybridization pattern. 

Preferably, this technique could be performed in the Core or NS5B region.

The term "nucleic acid" can also be referred to as analyte strand and corresponds to a
single- or double-stranded nucleic acid molecule. This analyte strand is preferentially positive-
or negative stranded RNA, cDNA or amplified cDNA.

The term "biological sample" refers to any biological sample (tissue or fluid) containing
HCV nucleic acid sequences and refers more particularly to blood serum or plasma samples.

The term "HCV subtype 2d primer" refers to a primer which specifically amplifies HCV
subtype 2d sequences present in a sample (see Examples section and figures).

The term "HCV type 3 primer" refers to a primer which specifically amplifies HCV type
3 sequences present in a sample (see Examples section and figures).

The term "HCV type 4 primer" refers to a primer which specifically amplifies HCV type
4 genomes present in a sample.

The term "universal HCV primer" refers to oligonucleotide sequences complementary to 
any of the conserved regions of the HCV genome. 

The term "HCV type 5 primer" refers to a primer which specifically amplifies HCV type
5 genomes present in a sample.

The expression "appropriate" hybridization and washing conditions is to be understood
as meaning stringent conditions, which are generally known in the art (e.g. Maniatis et al.,
Molecular Cloning: A Laboratory Manual, New York, Cold Spring Harbor Laboratory, 1982).

However, according to the hybridization solution (SSC, SSPE, etc.), these probes should 
be hybridized at their appropriate temperature in order to attain sufficient specificity.

The term "labelled" refers to the use of labelled nucleic acids. This may include the use
of labelled nucleotides incorporated during the polymerase step of the amplification such as
illustrated by Saiki et al. (1988) or Bej et al. (1990), or labelled primers, or any other
method known to the person skilled in the art.

The process of the invention comprises the steps of contacting any of the probes as defined 
above, with one of the following elements: 

- either a biological sample in which the nucleic acids are made available for
hybridization,

- or the purified nucleic acids contained in the biological sample,

- or a single copy derived from the purified nucleic acids,

- or an amplified copy derived from the purified nucleic acids,

with said elements or with said probes being attached to a solid substrate.
The expression "inferring the presence of one or more HCV genotypes present from the
observed hybridization pattern" refers to the identification of the presence of HCV genomes
in the sample by analyzing the pattern of binding of a panel of oligonucleotide probes. Single
probes may provide useful information concerning the presence or absence of HCV genomes
in a sample. On the other hand, the variation of the HCV genomes is dispersed in nature, so
rarely is any one probe able to identify uniquely a specific HCV genome. Rather, the identity
of an HCV genotype may be inferred from the pattern of binding of a panel of
oligonucleotide probes, which are specific for (different) segments of the different HCV
genomes. Depending on the choice of these oligonucleotide probes, each known HCV
genotype will correspond to a specific hybridization pattern upon use of a specific
combination of probes. Each HCV genotype can also be discriminated from any
other HCV genotype amplified with the same primers, depending on the choice of the
oligonucleotide probes. Comparison of the generated pattern of positively hybridizing probes
for a sample containing one or more unknown HCV sequences to a scheme of expected
hybridization patterns allows one to clearly infer the HCV genotypes present in said sample.
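
By way of illustration only, the comparison of an observed pattern of positively hybridizing probes with a scheme of expected patterns could be sketched as follows. The probe names and the expected patterns in this sketch are hypothetical placeholders and do not reproduce the scheme of the present description; the rule that a genotype is reported when all of its expected probes are positive is likewise an assumption made for the example.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical probe names and expected patterns, for illustration only.
EXPECTED_PATTERNS: Dict[str, FrozenSet[str]] = {
    "type 3a": frozenset({"probe_A", "probe_C"}),
    "type 4": frozenset({"probe_B", "probe_D"}),
    "type 5a": frozenset({"probe_B", "probe_E"}),
}


def infer_genotypes(positive_probes: Set[str]) -> List[str]:
    """Return every genotype whose expected probes are all positive.

    Assumed rule: a genotype is reported when its complete expected
    pattern is a subset of the observed positives, so that a mixed
    sample may yield more than one genotype.
    """
    return sorted(
        genotype
        for genotype, pattern in EXPECTED_PATTERNS.items()
        if pattern <= positive_probes
    )
```

In this sketch a single positive probe that is shared between two patterns is not sufficient to report either genotype, which reflects the statement above that one probe rarely identifies a genome uniquely.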

The present invention thus relates to a method as defined above, wherein one or more
hybridization probes are selected from any of SEQ ID NO 1, 3, 5, 7, 9, 11, 13, 15, 17, 19,
21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59 or 61, 106,
108, 110, 112, 114, 116, 118, 120, 122, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161,
163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197,
199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 222, 269 or sequence variants thereof,
with said sequence variants containing deletions and/or insertions of one or more nucleotides,
mainly at their extremities (either 3' or 5'), or substitutions of some non-essential nucleotides
(i.e. nucleotides not essential to discriminate between genotypes) by others (including
modified nucleotides or inosine), or with said variants consisting of the complement of any
of the above-mentioned oligonucleotide probes, or with said variants consisting of
ribonucleotides instead of deoxyribonucleotides, all provided that said variant probes can be
caused to hybridize with the same specificity as the oligonucleotide probes from which they
are derived.
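
Two of the variant classes mentioned above, namely the complement of a probe and a probe consisting of ribonucleotides, can be derived mechanically from a probe sequence. The following is a minimal sketch under the assumption that probes are written 5' to 3' using only the letters A, C, G and T; the function names are hypothetical and the example sequence is arbitrary.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(probe: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return probe.upper().translate(_COMPLEMENT)[::-1]


def as_ribonucleotides(probe: str) -> str:
    """Return the same sequence written with uracil instead of thymine."""
    return probe.upper().replace("T", "U")
```

Whether such a variant hybridizes with the same specificity as the probe from which it is derived remains to be verified experimentally, as required above.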

In order to distinguish the amplified HCV genomes from each other, the target polynucleic
acids are hybridized to a set of sequence-specific DNA probes targeting HCV genotypic
regions located in the HCV polynucleic acids.

Most of these probes target the most type-specific regions of HCV genotypes, but some 
can be caused to hybridize to more than one HCV genotype. 

According to the hybridization solution (SSC, SSPE, etc.), these probes should be
stringently hybridized at their appropriate temperature in order to attain sufficient specificity.
However, by slightly modifying the DNA probes, either by adding or deleting one or a few
nucleotides at their extremities (either 3' or 5'), or substituting some non-essential nucleotides
(i.e. nucleotides not essential to discriminate between types) by others (including modified
nucleotides or inosine), these probes or variants thereof can be caused to hybridize specifically
at the same hybridization conditions (i.e. the same temperature and the same hybridization
solution). Changing the amount (concentration) of probe used may also be beneficial to obtain
more specific hybridization results. It should be noted in this context that probes of the same
length, regardless of their GC content, will hybridize specifically at approximately the same
temperature in TMACl solutions (Jacobs et al., 1988).

Suitable assay methods for purposes of the present invention to detect hybrids formed
between the oligonucleotide probes and the nucleic acid sequences in a sample may comprise
any of the assay formats known in the art, such as the conventional dot-blot format,
sandwich hybridization or reverse hybridization. For example, the detection can be
accomplished using a dot-blot format, the unlabelled amplified sample being bound to a
membrane, the membrane being incubated with at least one labelled probe under suitable
hybridization and wash conditions, and the presence of bound probe being monitored.

An alternative and preferred method is a "reverse" dot-blot format, in which the amplified
sequence contains a label. In this format, the unlabelled oligonucleotide probes are bound to
a solid support and exposed to the labelled sample under appropriate stringent hybridization
and subsequent washing conditions. It is to be understood that any other assay method
which relies on the formation of a hybrid between the nucleic acids of the sample and the
oligonucleotide probes according to the present invention may also be used.

According to an advantageous embodiment, the process of detecting one or more HCV 
genotypes contained in a biological sample comprises the steps of contacting amplified HCV 
nucleic acid copies derived from the biological sample, with oligonucleotide probes which 
have been immobilized as parallel lines on a solid support. 

According to this advantageous method, the probes are immobilized in a Line Probe Assay
(LiPA) format. This is a reverse hybridization format (Saiki et al., 1989) using membrane
strips onto which several oligonucleotide probes (including negative or positive control
oligonucleotides) can be conveniently applied as parallel lines.

The invention thus also relates to a solid support, preferably a membrane strip, carrying
on its surface one or more probes as defined above, coupled to the support in the form of
parallel lines.

The LiPA is a very rapid and user-friendly hybridization test. Results can be read 4 h
after the start of the amplification. After amplification, during which usually a non-isotopic
label is incorporated in the amplified product, and alkaline denaturation, the amplified product
is contacted with the probes on the membrane and the hybridization is carried out for about
1 to 1.5 h, after which hybridized polynucleic acid is detected. From the hybridization pattern generated,
the HCV type can be deduced either visually or, preferably, using dedicated software. The
LiPA format is completely compatible with commercially available scanning devices, thus
rendering automatic interpretation of the results very reliable. All these advantages make the
LiPA format suitable for HCV detection in a routine setting. The LiPA format should
be particularly advantageous for detecting the presence of different HCV genotypes.

The present invention also relates to a method for detecting and identifying novel HCV
genotypes, different from the known HCV genomes, comprising the steps of:

determining to which HCV genotype the nucleotides present in a biological sample
belong, according to the process as defined above,

in the case of observing a sample which does not generate a hybridization pattern
compatible with those defined in Table 3, sequencing the portion of the HCV
genome sequence corresponding to the aberrantly hybridizing probe of the new
HCV genotype to be determined.

The present invention also relates to the use of a composition as defined above, for
detecting one or more genotypes of HCV present in a biological sample liable to contain
them, comprising the steps of:

(i) possibly extracting sample nucleic acid,

(ii) amplifying the nucleic acid with at least one of the primers as defined above,

(iii) sequencing the amplified product,

(iv) inferring the HCV genotypes present from the determined sequences by comparison
to all known HCV sequences.
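
Step (iv) can be pictured as a search for the known sequence that is most similar to the determined sequence. The sketch below is an assumption-laden illustration: it supposes that the sequences have already been aligned to equal length, it scores similarity as the simple percentage of identical positions, and the reference names and sequences used to exercise it are hypothetical.

```python
from typing import Dict, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def closest_reference(query: str, references: Dict[str, str]) -> Tuple[str, float]:
    """Return the name and percent identity of the best-matching reference."""
    scored = ((name, percent_identity(query, seq)) for name, seq in references.items())
    return max(scored, key=lambda pair: pair[1])
```

A low best score against every known sequence would, in the spirit of the method described above, point towards a genotype not yet represented among the references.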

The present invention also relates to a composition consisting of or comprising at least one
peptide or polypeptide comprising a contiguous sequence of at least 5 amino acids
corresponding to a contiguous amino acid sequence encoded by at least one of the HCV
genomic sequences as defined above, having at least one amino acid differing from the
corresponding region of known HCV (type 1 and/or type 2 and/or type 3) polyprotein
sequences as shown in Table 3, or muteins thereof.

It is to be noted that, at the level of the amino acid sequence, an amino acid difference
(with respect to known HCV amino acid sequences) is necessary, which means that the
polypeptides of the invention correspond to polynucleic acids having a nucleotide difference
(with known HCV polynucleic acid sequences) involving an amino acid difference.

The new amino acid sequences, as deduced from the disclosed nucleotide sequences (see
SEQ ID NO 1 to 62 and 106 to 123 and 143 to 218, 223 and 270), show homologies of only
59.9 to 78% with prototype sequences of type 1 and 2 for the NS4 region, and of only 53.9
to 68.8% with prototype sequences of type 1 and 2 for the E1 region. As the NS4 region is
known to contain several epitopes, for example characterized in patent application
EP-A-0 489 968, and as the E1 protein is expected to be subject to immune attack as part of the viral
envelope and expected to contain epitopes, the NS4 and E1 epitopes of the new type 3, 4 and
5 isolates will consistently differ from the epitopes present in type 1 and 2 isolates. This is
exemplified by the type-specificity of NS4 synthetic peptides as presented in example 4, and
the type-specificity of recombinant E1 proteins in example 11.

After aligning the new subtype 2d, type 3, 4 and 5 (see SEQ ID NO 1 to 62 and 106 to
123 and 143 to 218, 223 and 270) amino acid sequences with the prototype sequences of type
1a, 1b, 2a, and 2b, type- and subtype-specific variable regions can be delineated as presented
in Figure 5 and 7.

As to the muteins derived from the polypeptides of the invention, Table 4 gives an
overview of the amino acid substitutions which could be the basis of some of the muteins as
defined above.

The peptides according to the present invention contain preferably at least 5 contiguous
HCV amino acids, preferably however at least 8 contiguous amino acids, at least 10 or at
least 15 (for instance at least 9, 11, 12, 13, 14, 20 or 25 amino acids) of the new HCV
sequences of the invention.
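
The two requirements stated above, a minimum number of contiguous amino acids and at least one amino acid difference with respect to known sequences, can be combined in a simple window scan. The following sketch is illustrative only: the function name is hypothetical, and the "known" sequence used to exercise it is an arbitrary placeholder rather than a real prototype sequence.

```python
from typing import Iterable, List


def novel_fragments(new_seq: str, known_seqs: Iterable[str], length: int = 5) -> List[str]:
    """List contiguous fragments of `new_seq` absent from every known sequence.

    A fragment of the given length is kept only if it does not occur
    verbatim in any of the known sequences, i.e. if it carries at least
    one amino acid difference over that window.
    """
    known = list(known_seqs)
    fragments = []
    for start in range(len(new_seq) - length + 1):
        fragment = new_seq[start:start + length]
        if not any(fragment in seq for seq in known):
            fragments.append(fragment)
    return fragments
```

Raising `length` to 8, 10 or 15 corresponds to the more preferred minimum peptide lengths mentioned above.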
TABLE 4

Amino acids    Synonymous groups

Ser (S)        Ser, Thr, Gly, Asn
Arg (R)        Arg, His, Lys, Glu, Gln
Leu (L)        Leu, Ile, Met, Phe, Val, Tyr
Pro (P)        Pro, Ala, Thr, Gly
Thr (T)        Thr, Pro, Ser, Ala, Gly, His, Gln
Ala (A)        Ala, Pro, Gly, Thr
Val (V)        Val, Met, Ile, Tyr, Phe, Leu, Val
Gly (G)        Gly, Ala, Thr, Pro, Ser
Ile (I)        Ile, Met, Leu, Phe, Val, Ile, Tyr
Phe (F)        Phe, Met, Tyr, Ile, Leu, Trp, Val
Tyr (Y)        Tyr, Phe, Trp, Met, Ile, Val, Leu
Cys (C)        Cys, Ser, Thr, Met
His (H)        His, Gln, Arg, Lys, Glu, Thr
Gln (Q)        Gln, Glu, His, Lys, Asn, Thr, Arg
Asn (N)        Asn, Asp, Ser, Gln
Lys (K)        Lys, Arg, Glu, Gln, His
Asp (D)        Asp, Asn, Glu, Gln
Glu (E)        Glu, Gln, Asp, Lys, Asn, His, Arg
Met (M)        Met, Ile, Leu, Phe, Val
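
The synonymous groups of Table 4 lend themselves to a lookup that tests whether a candidate sequence differs from a parent sequence only by substitutions listed in the table. The sketch below transcribes only five rows of Table 4 (Ser, Arg, Lys, Asp and Met) into one-letter code; the function names are hypothetical, and treating a mutein as a same-length sequence with table-permitted substitutions only is an assumption made for the example.

```python
from typing import Dict

# Excerpt of Table 4 in one-letter code (Ser, Arg, Lys, Asp and Met rows only).
SYNONYMOUS_GROUPS: Dict[str, str] = {
    "S": "STGN",
    "R": "RHKEQ",
    "K": "KREQH",
    "D": "DNEQ",
    "M": "MILFV",
}


def is_allowed_substitution(original: str, replacement: str) -> bool:
    """True if `replacement` is in the synonymous group of `original`."""
    return replacement in SYNONYMOUS_GROUPS.get(original, original)


def is_mutein(parent: str, candidate: str) -> bool:
    """True if `candidate` differs from `parent` only by allowed substitutions."""
    if len(parent) != len(candidate):
        return False
    return all(is_allowed_substitution(p, c) for p, c in zip(parent, candidate))
```

Residues whose row has not been transcribed into the excerpt are only allowed to remain unchanged in this sketch.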



The polypeptides of the invention, and particularly the fragments, can be prepared by
classical chemical synthesis.

The synthesis can be carried out in homogeneous solution or in solid phase.

For instance, the synthesis technique in homogeneous solution which can be used is the one
described by Houben-Weyl in the book entitled "Methoden der organischen Chemie" (Methods
of organic chemistry) edited by E. Wünsch, vol. 15-I and II, Thieme, Stuttgart 1974.

The polypeptides of the invention can also be prepared in solid phase according to the
methods described by Atherton and Sheppard in their book entitled "Solid phase peptide
synthesis" (IRL Press, Oxford, 1989).

The polypeptides according to this invention can be prepared by means of recombinant
DNA techniques as described by Maniatis et al. (Molecular Cloning: A Laboratory Manual,
New York, Cold Spring Harbor Laboratory, 1982).

The present invention relates particularly to a polypeptide or peptide composition as
defined above, wherein said contiguous sequence contains in its sequence at least one of the
following amino acid residues:
L7, Q43, M44, S60, R67, Q70, T71, A79, A87, N106, K115, A127, A190, S130, V134,
G142, I144, E152, A157, V158, P165, S177 or Y177, I178, V180 or E180 or F182, R184,
I186, H187, T189, A190, S191 or G191, Q192 or L192 or I192 or V192 or E192, N193 or
H193 or P193, W194 or Y194, H195, A197 or I197 or V197 or T197, V202, I203 or L203,
Q208, A210, V212, F214, T216, R217 or D217 or E217 or V217, H218 or N218, H219 or
V219 or L219, L227 or I227, M231 or E231 or Q231, T232 or D232 or A232 or K232,
Q235 or I235, A237 or T237, I242, I246, S247, S248, V249, S250 or Y250, I251 or V251
or M251 or F251, D252, T254 or V254, L255 or V255, E256 or A256, M258 or F258 or
V258, A260 or Q260 or S260, A261, T264 or Y264, M265, I266 or A266, A267, G268 or
T268, F271 or M271 or V271, I277, M280 or H280, I284 or A284 or L284, V274, V291,
N292 or S292, R293 or I293 or Y293, Q294 or R294, L297 or I297 or Q297, A299 or K299
or Q299, N303 or T303, T308 or L308, T310 or F310 or A310 or D310 or V310, L313,
G317 or Q317, L333, S351, A358, A359, A363, S364, A366, T369, L373, F376, Q386,
I387, S392, I399, F402, I403, R405, D454, A461, A463, T464, K484, Q500, E501, S521,
K522, H524, N528, S531, S532, V534, F536, F537, M539, I546, C1282, A1283, H1310,
V1312, Q1321, P1368, V1372, V1373, K1405, Q1406, S1409, A1424, A1429, C1435,
S1436, S1456, H1496, A1504, D1510, D1529, I1543, N1567, D1556, N1567, M1572,
Q1579, L1581, S1583, F1585, V1595, E1606 or T1606, M1611, V1612 or L1612, P1630,
C1636, P1651, T1656 or I1656, L1663, V1667, V1677, A1681, H1685, E1687, G1689,
V1695, A1700, Q1704, Y1705, A1713, A1714 or S1714, M1718, D1719, A1721 or T1721,
R1722, A1723 or V1723, H1726 or G1726, E1730, V1732, F1735, I1736, S1737, R1738,
T1739, G1740, Q1741, K1742, Q1743, A1744, T1745, L1746, E1747 or K1747, I1749,
A1750, T1751 or A1751, V1753, N1755, K1756, A1757, P1758, A1759, H1762, T1763,
Y1764, P2645, A2647, K2650, K2653 or L2653, S2664, N2673, F2680, K2681, L2686,
H2692, Q2695 or L2695 or I2695, V2712, F2715, V2719 or Q2719, T2722, T2724, S2725,
R2726, G2729, Y2735, H2739, I2748, G2746 or I2746, I2748, P2752 or K2752, P2754 or
T2754, T2757 or P2757,

with said notation being composed of a letter representing the amino acid residue by its one-letter
code, and a number representing the amino acid numbering according to Kato et al.,
1990 as shown in Table 1 (comparison with other isolates). See also the numbering in Figures
2, 5, 7, and 11 (alignment amino acid sequences).

Within the group of unique and new amino acid residues of the present invention, the
following residues were found to be specific for the following types of HCV according to the
HCV classification system used in the present invention:

Q208, R217, E231, I235, I246, T264, I266, A267, F271, K299, L2686, Q2719
which are specific for the HCV subtype 2d sequences of the present invention as
shown in Fig. 5 and 2;

Q43, S60, R67, F182, I186, H187, A190, S191, L192, W194, V202, L203, V219,
Q231, D232, A237, T254, M280, Q299, T303, L308, and/or L313 which are
specific for the Core/E1 region of HCV type 3 of the invention as shown in Fig.
5;

D1556, Q1579, L1581, S1584, F1585, E1606, V1612, P1630, C1636, T1656,
L1663, H1685, E1687, G1689, V1695, Y1705, A1713, A1714, A1721, V1723,
H1726, R1738, Q1743, A1744, E1747, I1749, A1751, A1759 and/or H1762 which
are specific for the NS3/4 region of HCV type 3 sequences of the invention as
shown in Fig. 7;

K2665, D2666, R2670 which are specific for the NS5B region of HCV type 3 of 
the invention as shown in Fig. 2; 

L7, A79, A127, S130, E152, V158, S177 or Y177, V180 or E180, R184, T189,
Q192 or E192 or I192, N193 or H193, I197 or V197, I203, A210, V212, E217,
H218, H219, L227, A232, V249, I251 or M251, D252, L255 or V255, E256,
M258 or V258 or F258, A260 or Q260, M265, T268, V271, V274, M280, I284,
N292 or S292, Q294, L297 or I297, T308, A310 or D310 or V310 or T310, and
G317 which are specific for the Core/E1 region of HCV type 4 sequences of the
present invention as shown in Fig. 5;

P2645, K2650, K2653, G2656, V2658, T2668, N2673 or N2673, K2681, H2686,
D2691, L2692, Q2695 or L2695 or I2695, Y2704, V2712, F2715, V2719, I2722,
S2725, G2729, Y2735, G2746 or I2746, P2752 or K2752, Q2753, P2754 or
T2754, T2757 or P2757 which are specific for the NS5B region of the HCV type
4 sequences of the present invention as shown in Fig. 2;

M44, Q70, A87, N106, K115, V137, G142, P165, I178, F251, A299, N303, Q317
which are specific for the Core/E1 region of the HCV type 4 sequences of the
present invention as shown in Fig. 5;

L333, S351, A358, A359, A363, S364, A366, T369, L373, F376, Q386, I387,
S392, I399, F402, I403, R405, D454, A461, A463, T464, K484, Q500, E501,
S521, K522, H524, N528, S532, V534, F537, M539, I546 which are specific for
the E1/E2 region of the HCV type 5 sequences of the present invention as shown
in Fig. 12;

C1282, A1283, V1312, Q1321, P1368, V1372, K1405, Q1406, S1409, A1424,
A1429, C1435, S1436, S1456, H1496, A1504, D1510, D1529, I1543, N1567,
M1572, V1595, T1606, M1611, L1612, I1656, V1667, A1681, A1700, A1713,
S1714, M1718, D1719, T1721, R1722, A1723, G1726, F1735, I1736, S1737,
T1739, G1740, K1742, T1745, L1746, K1747, A1750, V1753, N1755, A1757,
D1758, T1763, and Y1764 which are specific for the NS3/NS4 region of HCV
type 5 sequences of the invention as shown in Fig. 7;

A2647, L2653, S2674, F2680, T2724, R2726, Y2730, H2739 which are specific
for the NS5B region of the HCV type 5 sequences of the present invention as
shown in Fig. 2;

A256, P1631, V1677, Q1704, E1730, V1732, Q1741 and T1751 which are specific
for the HCV type 3 and 5 sequences of the present invention as shown in Fig. 5
and 7;

T71, A157, I227, T237, T240, Y250, V251, S260, ^G7U T2673, T2722, I2748
which are specific for the HCV type 3 and 4 sequences of the present invention as
shown in Fig. 5 and 2;

V192, Y194, A197, F249, S250, R294 which are specific for the HCV type 4 and
5 sequences of the present invention as shown in Fig. 5;

I293 which is specific for the HCV type 4 and subtype 2d sequence of the present
invention as shown in Fig. 5;

D217 and R294 which are specific for the HCV type 3, 4 and 5 sequences of the
present invention as shown in Fig. 5;

L192 which is specific for the HCV type 3 and subtype 2d sequences of the present
invention as shown in Fig. 5;

G191 and T197 which are specific for the HCV type 3, 4 and subtype 2d sequences
of the present invention as shown in Fig. 5;

K232 which is specific for the HCV subtype 2d and type 5 sequences of the present
invention as shown in Fig. 5,

and with said notation being composed of a letter, unambiguously representing the amino acid
by its one-letter code, and a number representing the amino acid numbering according to Kato
et al., 1990 (see also Table 1 for comparison with other isolates), as well as Figure 2 (NS5
region), Figure 5 (Core/E1 region), Figure 7 (NS3/NS4 region) and Figure 12 (E1/E2 region).
Some of the above-mentioned amino acids may be contained in type- or subtype-specific
epitopes.

For example, M231 (detected in type 5) refers to a methionine at position 231. A glutamine
(Q) is present at the same position 231 in type 3 isolates, whereas this position is occupied
by an arginine (R) in type 1 isolates and by a lysine (K) or asparagine (N) in type 2 isolates (see
Figure 5).
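
The residue notation can be checked mechanically against a peptide whose first amino acid position is known. The sketch below uses the V3 region peptide VQDGNTSTCWTPV (SEQ ID NO 87), which this description lists as spanning positions 230 to 242; the function names are hypothetical, and the sketch assumes an ungapped peptide numbered consecutively from its start position.

```python
import re
from typing import Tuple


def parse_residue(tag: str) -> Tuple[str, int]:
    """Split a tag such as 'Q231' into its one-letter code and position."""
    match = re.fullmatch(r"([A-Z])(\d+)", tag)
    if match is None:
        raise ValueError(f"not a residue tag: {tag!r}")
    return match.group(1), int(match.group(2))


def has_residue(peptide: str, start: int, tag: str) -> bool:
    """True if `peptide`, whose first residue is at `start`, carries `tag`."""
    amino_acid, position = parse_residue(tag)
    index = position - start
    return 0 <= index < len(peptide) and peptide[index] == amino_acid
```

For the peptide above, position 231 is the second residue, so the check succeeds for Q231 and fails for M231.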

The peptide or polypeptide according to this embodiment of the invention may optionally be
labelled, or attached to a solid substrate, or coupled to a carrier molecule such as biotin, or
mixed with a proper adjuvant.

The variable region in the core protein (V-CORE in Fig. 5) has been shown to be useful
for serotyping (Machida et al., 1992). The sequence of the disclosed type 5 sequence in this
region shows type-specific features. The peptide from amino acid 70 to 78 shows the
following unique sequence for the sequences of the present invention (see Figure 5):

QPTGRSWGQ (SEQ ID NO 93) 

RSEGRTSWAQ (SEQ ID NO 220) 

and RTEGRTSWAQ (SEQ ID NO 221) 

Another preferred V-Core spanning region is the peptide spanning positions 60 to 78 of
subtype 3c with sequence:

SRRQPIPR.\RRTEGRSWAQ (SEQ ID NO 268) 

Five type-specific variable regions (V1 to V5) can be identified after aligning E1 amino
acid sequences of the 4 genotypes, as shown in Figure 5.

Region V1 encompasses amino acids 192 to 203; this is the amino-terminal 10 amino acids
of the E1 protein. The following unique sequences as shown in Fig. 5 can be deduced:

LEWRNTSGLYVL (SEQ ID NO 83) 

VNYRNASGIYHI (SEQ ID NO 126) 

QHYRNISGIYHV (SEQ ID NO 127) 

EHYRNASGIYHI (SEQ ID NO 128) 

IHYRNASGIYHI (SEQ ID NO 224) 

VTYRNASGIYHV (SEQ ID NO 84) 

VNYRNASGIYHI (SEQ ID NO 225) 

VNYRNASGVYHI (SEQ ID NO 226) 

VNYHNTSGIYHL (SEQ ID NO 227) 
QHYRNASGIYHV (SEQ ID NO 228)

QHYRNVSGIYHV (SEQ ID NO 229) 

IHYRNASDGYYI (SEQ ID NO 230) 

LQVKNTSSSYMV (SEQ ID NO 231) 

Region V2 encompasses amino acids 213 to 223. The following unique sequences can be
found in the V2 region as shown in Figure 3:
VYEADDVILHT (SEQ ID NO 85) 
VYETEHHILHL (SEQ ID NO 129) 
VYEADHHIMHL (SEQ ID NO 130) 
VYETDHHILHL (SEQ ID NO 131) 
VYEADNLILHA (SEQ ID NO 86) 
VWQLRAIVLHV (SEQ ID NO 232) 
VYEADYHILHL (SEQ ID NO 233) 
VYETDNHILHL (SEQ ID NO 234) 
VYETENHILHL (SEQ ID NO 235) 
VFETVHHILHL (SEQ ID NO 236) 
VFETEHHILHL (SEQ ID NO 237) 
VFETDHHIMHL (SEQ ID NO 238) 
VYETENHILHL (SEQ ID NO 239) 
VYEADALILHA (SEQ ID NO 240) 

Region V3 encompasses the amino acids 230 to 242. The following unique V3 region 
sequences can be deduced from Figure 5: 
VQDGNTSTCWTPV (SEQ ID NO 87) 
VQDGNTSACWTPV (SEQ ID NO 241) 
VRVGNQSRCWVAL (SEQ ID NO 132) 
VRTGNTSRCWVPL (SEQ ID NO 133) 
VRAGNVSRCWTPV (SEQ ID NO 134) 
EEKGNISRCWIPV (SEQ ID NO 242) 
VKTGNQSRCWVAL (SEQ ID NO 243) 
VRTGNQSRCWVAL (SEQ ID NO 244) 
VKTGNQSRCWIAL (SEQ ID NO 245) 
VKTGNVSRCVvqPL (SEQ ID NO 247) 
VKTGNVSRCWISL (SEQ ID NO 248) 
VRKDNVSRCWVQI (SEQ ID NO 249)

Region V4 encompasses the amino acids 248 to 257. The following unique V4 region
sequences can be deduced from Figure 5:

VRYVGATTAS (SEQ ID NO 89) 

APYIGAx^LES (SEQ ID NO 135) 

APY^/GAPLES (SEQ ID NO 136) 

AVSMDAPLES (SEQ ID NO 137) 

APSLGAVTAP (SEQ ID NO 90) 

APSFGAVTAP (SEQ ID NO 250) 

VSQPGALTKG (SEQ ID NO 251) 

VKYVGATTAS (SEQ ID NO 252) 

APYIGAPVES (SEQ ID NO 253) 

AQHLNAPLES (SEQ ID NO 254) 

SPYVGAPLEP (SEQ ID NO 255) 

SPYAGAPLEP (SEQ ID NO 256) 

APYLGAPLEP (SEQ ID NO 257) 

APYLGAPLES (SEQ ID NO 258) 

APY\'GAPLES (SEQ ID NO 259) 

VPYLGAPLTS (SEQ ID NO 260) 

APHLRAPLSS (SEQ ID NO 261) 

APYLGAPLTS (SEQ ID NO 262) 
Region V5 encompasses the amino acids 294 to 303. The following unique V5 region 
peptides can be deduced from figure 5: 

RPRRHQTVQT (SEQ ID NO 91)

QPRRHWTTQD (SEQ ID NO 138) 

RPRRHWTTQD (SEQ ID NO 139) 

RPRQHATVQN (SEQ ID NO 92) 

RPRQHATVQD (SEQ ID NO 263)

SPQHHKFVQD (SEQ ID NO 264) 

RPRRLWTTQE (SEQ ID NO 265) 

PPRIHETTQD (SEQ ID NO 266) 

The variable region in the E2 region (HVR-2) of type 5a as shown in Figure 12, spanning
amino acid positions 471 to 484, is also a preferred peptide according to the present invention
with the following sequence:

TISYANGSGPSDDK (SEQ ID NO 267)

The above list of peptides is particularly suitable for vaccine and diagnostic
development.

Also comprised in the present invention is any synthetic peptide or polypeptide containing
at least 5 contiguous amino acids derived from the above-defined peptides in their peptidic
chain.

According to a specific embodiment, the present invention relates to a composition as
defined above, wherein said contiguous sequence is selected from any of the following HCV
amino acid type 3 sequences:

- a sequence having a homology of more than 72%, preferably more than 74%, more
preferably more than 77% and most preferably more than 80 or 84% homology to any of
the amino acid sequences as represented in SEQ ID NO 14, 16, 18, 20, 22, 24, 26 or 28
(HD10, BR36, BR33 sequences) in the region spanning positions 140 to 319 in the
Core/E1 region as shown in Figure 5;

- a sequence having a homology of more than 70%, preferably more than 72%, more
preferably more than 75% homology, most preferably more than 81% homology to any
of the amino acid sequences as represented in SEQ ID NO 14, 16, 18, 20, 22, 24, 26 or
28 (HD10, BR36, BR33 sequences) in the E1 region spanning positions 192 to 319 as
shown in Figure 5;

- a sequence having a homology of more than 86%, preferably more than 88%, and most
preferably more than 90% homology to the amino acid sequences as represented in SEQ
ID NO 148 (type 3c; BE98) in the region spanning positions 1 to 110 in the Core region
as shown in Figure 5;

- a sequence having a homology of more than 76%, preferably more than 78%, most
preferably more than 80% to any of the amino acid sequences as represented in SEQ ID
NO 30, 32, 34, 36, 38 or 40 (HCC153, HD10, BR36 sequences) in the region spanning
positions 1646 to 1764 in the NS3/NS4 region as shown in Figure 7 and 11;

- a sequence having a homology of more than 81%, preferably more than 83%, and most
preferably more than 86% homology to any of the amino acid sequences as represented
in SEQ ID NO 14, 16, 18, 20, 22, 24, 26 or 28 (HD10, BR36, BR33 sequences) in the
region spanning positions 140 to 319 in the Core/E1 region as shown in Figure 5;

- a sequence having a homology of more than 81.5%, preferably more than 83%, and most
preferably more than 86% homology to any of the amino acid sequences as represented
in SEQ ID NO 14, 16, 18, 20, 22, 24, 26 or 28 (HD10, BR36, BR33 sequences) in the
E1 region spanning positions 192 to 319 as shown in Figure 5;

- a sequence having a homology of more than 86%, preferably more than 88%, most
preferably more than 90% to the amino acid sequence as represented in SEQ ID NO 150
(type 3c; BE98) in the region spanning positions 2645 to 2757 in the NS5B region as shown
in Figure 2.

According to yet another embodiment, the present invention relates to a composition as
defined above, wherein said contiguous sequence is selected from any of the following HCV
amino acid type 4 sequences:

- a sequence having a homology of more than 80%, preferably more than 82%, most
preferably more than 84% homology to any of the amino acid sequences as represented
in SEQ ID NO 118, 120, and 122 (GB358, GB549, GB809 sequences) in the region
spanning positions 127 to 319 of the Core/E1 region as shown in Figure 5;

- a sequence having a homology of more than 73%, preferably more than 75%, most
preferably more than 78% homology in the E1 region spanning positions 192 to 319 to any
of the amino acid sequences as represented in SEQ ID NO 118, 120, and 122 (GB358,
GB549, GB809 sequences) in the region spanning positions 140 to 319 of the Core/E1
region as shown in Figure 5;

- a sequence having more than 85%, preferably more than 86%, most preferably more than
87% homology to any of the amino acid sequences as represented in SEQ ID NO 118, 120
or 122 (GB358, GB549, GB809 sequences) in the region spanning positions 192 to 319 of
E1 as shown in Figure 5;

- a sequence showing more than 73%, preferably more than 74%, most preferably more than
75% homology to any of the amino acid sequences as represented in SEQ ID NO 106,
108, 110, 112, 114 or 116 (GB48, GB116, GB215, GB358, GB549, GB809 sequences)
in the region spanning positions 2645 to 2757 of the NS5B region as shown in Figure 2;

- a sequence having any of the sequences as represented in SEQ ID NO 164 or 166 (GB809
and CAM600 sequences) in the Core/E1 region as shown in Figure 5;

- a sequence having any of the sequences as represented in SEQ ID NO 168, 170, 172, 174,
176, 178, 180, 182, 184, 186, 188 or 190 (CAM600, GB809, CAMG22, CAMG27,
GB549, GB438, CAR4/1205, CAR4/901, GB116, GB215, GB958, GB809-4 sequences)
in the E1 region as shown in Figure 5;

- a sequence having any of the sequences as represented in SEQ ID NO 192, 194, 196, 198,
200, 202, 204, 206, 208, 210, 212 (GB358, GB724, BE100, PC, CAM600, CAMG22,
etc.) in the NS5B region.

The above-mentioned type 4 peptides or polypeptides comprise at least an amino acid
sequence selected from any HCV type 4 polyprotein with the exception of the core sequence as
disclosed by Simmonds et al. (1993, EG-29, see Figure 5).

According to yet another aspect, the present invention relates to a composition as defined
above, wherein said contiguous sequence is selected from any of the following HCV amino
acid type 5 sequences:

- a sequence having more than 93%, preferably more than 94%, most preferably more than
95% homology in the region spanning Core positions 1 to 191 to any of the amino acid
sequences as represented in SEQ ID NO 42, 44, 46, 48, 50, 52 or 54 (PC sequences) and
SEQ ID NO 152 (BE95) as shown in Figure 5;

- a sequence having more than 73%, preferably more than 74%, most preferably
more than 76% homology in the region spanning E1 positions 192 to 319 to any
of the amino acid sequences as represented in SEQ ID NO 42, 44, 46, 48, 50, 52
or 54 (PC sequences) as shown in Figure 5;

- a sequence having more than 78%, preferably more than 80%, most preferably more
than 83% homology to any of the amino acid sequences as represented in SEQ ID NO 42,
44, 46, 48, 50, 52, 54, 154, 156 (BE95, BE100) (PC sequences) in the region spanning
positions 1 to 319 of the Core/E1 region as shown in Figure 5;

- a sequence having more than 90%, preferably more than 91%, most preferably more than
92% homology to any of the amino acid sequences represented in SEQ ID NO 56 to 58
(PC sequences) in the region spanning positions 1286 to 1403 of the NS3 region as shown
in Figure 7 or 11;

- a sequence having more than 66%, more particularly 68%, most particularly 70% or more
homology to any of the amino acid sequences as represented in SEQ ID NO 60 or 62 (PC
sequences) in the region spanning positions 1646 to 1764 of the NS3/4 region as shown
in Figure 7 or 11.

According to yet another embodiment, the present invention relates to a
composition as defined above, wherein said contiguous sequence is selected from any of
the following HCV amino acid type 2d sequences:

- a sequence having more than 83%, preferably more than 85%, most preferably more than
87% homology to the amino acid sequence as represented in SEQ ID NO 144 (NE92) in
the region spanning positions 1 to 319 of the Core/E1 region as shown in Figure 5;

- a sequence having more than 79%, preferably more than 81%, most preferably more than
84% homology in the region spanning E1 positions 192 to 319 to the amino acid sequence
as represented in SEQ ID NO 144 (NE92) as shown in Figure 12;

- a sequence having more than 95%, more particularly 96%, most particularly 97% or more
homology to the amino acid sequence as represented in SEQ ID NO 146 (NE92) in the
region spanning positions 2645 to 2757 of the NS5B region as shown in Figure 2.

The present invention also relates to a recombinant vector, particularly for cloning and/or
expression, with said recombinant vector comprising a vector sequence, an appropriate
prokaryotic, eukaryotic or viral promoter sequence followed by the nucleotide sequences as
defined above, with said recombinant vector allowing the expression of any one of the HCV
type 2 and/or HCV type 3 and/or type 4 and/or type 5 derived polypeptides as defined above
in a prokaryotic or eukaryotic host, or in living mammals when injected as naked DNA, and
more particularly a recombinant vector allowing the expression of any of the following HCV
type 2d, type 3, type 4 or type 5 polypeptides spanning the following amino acid positions:

a polypeptide starting at position 1 and ending at any position in the region between
positions 70 and 326, more particularly a polypeptide spanning positions 1 to 70, positions
1 to 85, positions 1 to 120, positions 1 to 150, positions 1 to 191, positions 1 to
200, for expression of the Core protein, and a polypeptide spanning positions 1 to
263, positions 1 to 326, for expression of the Core and E1 protein;

a polypeptide starting at any position in the region between positions 117 and 192,
and ending at any position in the region between positions 263 and 326, for
expression of E1, or forms that have the putative membrane anchor deleted
(positions 264 to 293 plus or minus 8 amino acids);

a polypeptide starting at any position in the region between positions 1556 and
1688, and ending at any position in the region between positions 1739 and 1764,
for expression of the NS4 regions, more particularly a polypeptide starting at
position 1658 and ending at position 1711 for expression of the NS4a antigen, and
more particularly, a polypeptide starting at position 1712 and ending between
positions 1743 and 1972, for example 1712-1743, 1712-1764, 1712-1782, 1712-1972,
1712 to 1782 and 1902 to 1972 for expression of the NS4b protein or parts
thereof.

The term "vector" may comprise a plasmid, a cosmid, a phage, or a virus.
In order to carry out the expression of the polypeptides of the invention in bacteria such
as E. coli or in eukaryotic cells such as in S. cerevisiae, or in cultured vertebrate or
invertebrate hosts such as insect cells, Chinese Hamster Ovary (CHO), COS, BHK, and
MDCK cells, the following steps are carried out:

transformation of an appropriate cellular host with a recombinant vector, in which
a nucleotide sequence coding for one of the polypeptides of the invention has been
inserted under the control of the appropriate regulatory elements, particularly a
promoter recognized by the polymerases of the cellular host and, in the case of a
prokaryotic host, an appropriate ribosome binding site (RBS), enabling the
expression in said cellular host of said nucleotide sequence. In the case of a
eukaryotic host any artificial signal sequence or pre/pro sequence might be
provided, or the natural HCV signal sequence might be employed, e.g. for
expression of E1 the signal sequence starting between amino acid positions 117 and
170 and ending at amino acid position 191 can be used, for expression of NS4, the
signal sequence starting between amino acid positions 1646 and 1659 can be used,

culture of said transformed cellular host under conditions enabling the expression
of said insert.

The present invention also relates to a composition as defined above, wherein said
polypeptide is a recombinant polypeptide expressed by means of an expression vector as
defined above.

The present invention also relates to a composition as defined above, for use in a method
for immunizing a mammal, preferably a human, against HCV, comprising administering a
sufficient amount of the composition, possibly accompanied by pharmaceutically acceptable
adjuvants, to produce an immune response, more particularly a vaccine composition including
HCV type 3 polypeptides derived from the Core, E1 or the NS4 region and/or HCV type 4
and/or HCV type 5 polypeptides and/or HCV type 2d polypeptides.

The present invention also relates to an antibody raised upon immunization with a
composition as defined above by means of a process as defined above, with said antibody
being reactive with any of the polypeptides as defined above, and with said antibody being
preferably a monoclonal antibody.

The monoclonal antibodies of the invention can be produced by any hybridoma liable
to be formed according to classical methods from splenic cells of an animal, particularly from
a mouse or rat, immunized against the HCV polypeptides according to the invention, or
muteins thereof, or fragments thereof as defined above on the one hand, and of cells of a
myeloma cell line on the other hand, and to be selected by the ability of the hybridoma to
produce the monoclonal antibodies recognizing the polypeptides which were initially used
for the immunization of the animals.

The antibodies involved in the invention can be labelled by an appropriate label of the 
enzymatic, fluorescent, or radioactive type. 

The monoclonal antibodies according to this preferred embodiment of the invention may
be humanized versions of mouse monoclonal antibodies made by means of recombinant DNA
technology, departing from parts of mouse and/or human genomic DNA sequences coding
for H and L chains or from cDNA clones coding for H and L chains.

Alternatively the monoclonal antibodies according to this preferred embodiment of the
invention may be human monoclonal antibodies. These antibodies according to the present
embodiment of the invention can also be derived from human peripheral blood lymphocytes
of patients infected with type 3, type 4 or type 5 HCV, or vaccinated against HCV. Such
human monoclonal antibodies are prepared, for instance, by means of human peripheral blood
lymphocytes (PBL) repopulation of severe combined immune deficiency (SCID) mice (for
recent review, see Duchosal et al. 1992).

The invention also relates to the use of the proteins of the invention, muteins thereof, or
peptides derived therefrom for the selection of recombinant antibodies by the process of
repertoire cloning (Persson et al., 1991).

Antibodies directed to peptides derived from a certain genotype may be used either for
the detection of such HCV genotypes, or as therapeutic agents.

The present invention also relates to the use of a composition as defined above for
incorporation into an immunoassay for detecting HCV present in a biological sample liable to
contain it, comprising at least the following steps:

(i) contacting the biological sample to be analyzed for the presence of HCV antibodies
with any of the compositions as defined above, preferably in an immobilized form
under appropriate conditions which allow the formation of an immune complex,
wherein said polypeptide can be a biotinylated polypeptide which is covalently
bound to a solid substrate by means of streptavidin or avidin complexes,

(ii) removing unbound components,

(iii) incubating the immune complexes formed with heterologous antibodies, which
specifically bind to the antibodies present in the sample to be analyzed, with said
heterologous antibodies being conjugated to a detectable label under appropriate
conditions,

(iv) detecting the presence of said immune complexes visually or by means of
densitometry and inferring the HCV serotype present from the observed
hybridization pattern.

The present invention also relates to the use of a composition as defined above, for
incorporation into a serotyping assay for detecting one or more serological types of HCV
present in a biological sample liable to contain it, more particularly for detecting E1 and NS4
antigens or antibodies of the different types to be detected combined in one assay format,
comprising at least the following steps:

(i) contacting the biological sample to be analyzed for the presence of HCV antibodies
or antigens of one or more serological types, with at least one of the compositions
as defined above, in an immobilized form under appropriate conditions which allow
the formation of an immune complex,

(ii) removing unbound components,

(iii) incubating the immune complexes formed with heterologous antibodies, which
specifically bind to the antibodies present in the sample to be analyzed, with said
heterologous antibodies being conjugated to a detectable label under appropriate
conditions,

(iv) detecting the presence of said immune complexes visually or by means of
densitometry and inferring the presence of one or more HCV serological types
present from the observed binding pattern.
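
Step (iv), inferring the serological type from the observed binding pattern, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the signal values, the cutoff, the line labels and the function names are hypothetical, and a real assay would use validated cutoffs and interpretation rules.

```python
def reactive_lines_by_type(signals, cutoff):
    """Group reactive peptide lines by HCV type.

    signals: dict mapping (hcv_type, peptide_name) -> measured signal
             (e.g. a densitometry reading; units are an assumption).
    cutoff:  minimum signal counted as reactive (hypothetical value).
    """
    reactive = {}
    for (hcv_type, peptide), value in signals.items():
        if value >= cutoff:
            reactive.setdefault(hcv_type, []).append(peptide)
    return {hcv_type: sorted(peptides) for hcv_type, peptides in reactive.items()}


def call_serotype(signals, cutoff):
    """Return a single type, 'negative', or 'mixed or cross-reactive'."""
    reactive = reactive_lines_by_type(signals, cutoff)
    if not reactive:
        return "negative"
    if len(reactive) == 1:
        return next(iter(reactive))
    return "mixed or cross-reactive"


# Hypothetical strip with three NS4 peptide lines per type.
strip = {
    ("type 1", "NS4-1"): 0.1, ("type 1", "NS4-5"): 0.0, ("type 1", "NS4-7"): 0.2,
    ("type 2", "NS4-1"): 0.0, ("type 2", "NS4-5"): 0.1, ("type 2", "NS4-7"): 0.0,
    ("type 3", "NS4-1"): 2.4, ("type 3", "NS4-5"): 1.8, ("type 3", "NS4-7"): 0.3,
}
print(call_serotype(strip, cutoff=1.0))
```

The same structure would apply to a line pattern read visually, with the signal replaced by a reactive/non-reactive score.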

The present invention also relates to the use of a composition as defined above, for
immobilization on a solid substrate and incorporation into a reversed phase hybridization
assay, preferably for immobilization as parallel lines onto a solid support such as a membrane
strip, for determining the presence or the genotype of HCV according to a method as defined
above.

The present invention thus also relates to a kit for determining the presence of HCV
genotypes as defined above present in a biological sample liable to contain them, comprising:

possibly at least one primer composition containing any primer selected from those
defined above or any other HCV type 3 and/or HCV type 4, and/or HCV type 5,
or universal HCV primers,

at least one probe composition as defined above, with said probes being
preferentially immobilized on a solid substrate, and more preferentially on one and
the same membrane strip,

a buffer or components necessary for producing the buffer enabling hybridization
reaction between these probes and the possibly amplified products to be carried out,

means for detecting the hybrids resulting from the preceding hybridization,

possibly also including an automated scanning and interpretation device for
inferring the HCV genotypes present in the sample from the observed hybridization
pattern.

The genotype may also be detected by means of a type-specific antibody as defined above,
which is linked to any polynucleotide sequence that can afterwards be amplified by PCR to
detect the immune complex formed (Immuno-PCR, Sano et al., 1992).

The present invention also relates to a kit for determining the presence of HCV antibodies
as defined above present in a biological sample liable to contain them, comprising:

at least one polypeptide composition as defined above, preferentially in combination
with other polypeptides or peptides from HCV type 1, HCV type 2 or other types
of HCV, with said polypeptides being preferentially immobilized on a solid
substrate, and more preferentially on one and the same membrane strip,

a buffer or components necessary for producing the buffer enabling binding
reaction between these polypeptides and the antibodies against HCV present in the
biological sample,

means for detecting the immune complexes formed in the preceding binding
reaction,

possibly also including an automated scanning and interpretation device for
inferring the HCV genotypes present in the sample from the observed binding
pattern.

Figure Legends

Figure 1 

Alignment of consensus nucleotide sequences for each of the type 3a isolates BR34, BR36,
and BR33, deduced from the clones with SEQ ID NO 1, 5, 9; type 4 isolates GB48, GB116,
GB215, GB358, GB549, GB809, CAM600, CAMG22, GB438, CAR4/1205, CAR1/501
(SEQ ID NO 106, 108, 110, 112, 114, 116, 201, 203, 205, 207, 209 and 211); type 5a
isolates BE95 and BE96 (SEQ ID NO 159 and 161) and type 2d isolate NE92 (SEQ ID NO
145) from the region between nucleotides 7932 and 8271, with known sequences from the
corresponding region of isolates HCV-1, HCV-J, HC-J6, HC-J8, T1 and T9, and others as
shown in Table 3.

Figure 2 

Alignment of amino acid sequences deduced from the nucleic acid sequences as
represented in Figure 1 from the subtype 3a clones BR34 (SEQ ID NO 2, 4), BR36 (SEQ ID
NO 6, 8) and BR33 (SEQ ID NO 10, 12), the subtype 3c clone BE98 (SEQ ID NO 150), and
the type 4 clones GB48 (SEQ ID NO 107), GB116 (SEQ ID NO 109), GB215 (SEQ ID NO
111), GB358 (SEQ ID NO 113), GB549 (SEQ ID NO 115), GB809 (SEQ ID NO 117);
CAM600, CAMG22, GB438, CAR4/1205, CAR1/501 (SEQ ID NO 202, 204, 206, 208,
210, 212); the type 5a clones BE95 and BE96 (SEQ ID NO 160 and 162); as well as the
subtype 2d isolate NE92 (SEQ ID NO 146) from the region between amino acids 2645 to
2757, with known sequences from the corresponding region of isolates HCV-1, HCV-J,
HC-J6, and HC-J8, T1 and T9, and other sequences as shown in Table 3.

Figure 3 

Alignment of type 2d, 3c, 4 and 5a nucleotide sequences from isolates NE92, BE98,
GB358, GB809, CAM600, GB724, BE95 (SEQ ID NO 143, 147, 191, 163, 165, 193 and
151) in the Core region between nucleotide positions 1 and 500, with known sequences from
the corresponding region of type 1, type 2, type 3 and type 4 sequences.

Figure 4

Alignment of nucleotide sequences for the subtype 2d isolate NE92 (SEQ ID NO 143), the
type 4 isolates GB358 (SEQ ID NO 118 and 187), GB549 (SEQ ID NO 120 and 175), and
GB809-2 (SEQ ID NO 122 and 169), GB 809^, GB116, GB215, CAM600, CAMG22,
CAMG27, GB438, CAR4/1205, CAR4/901 (SEQ ID NO 189, 183, 185, 167, 171, 173, 177,
179, 181), sequences for each of the subtype 3a isolates HD10, BR36, and BR33 (SEQ ID
NO 13, 15, 17 (HD10), 19, 21 (BR36) and 23, 25 or 27 (BR33)) and the subtype 5a isolates
BE95 and BE100 (SEQ ID NO 143 and 195) from the region between nucleotides 379 and
957, with known sequences from the corresponding region of type 1 and 2 and 3.

Figure 5 

Alignment of amino acid sequences deduced from the new HCV nucleotide sequences of
the Core/E1 region of isolates BR33, BR36, HD10, GB358, GB549, and GB809, PC or
BE95, CAM600, and GB724 (SEQ ID NO 14, 20, 24, 119 or 192, 121, 123 or 164, 54 or
152, 166 and 194) from the region between positions 1 and 319, with known sequences from
type 1a (HCV-1), type 1b (HCV-J), type 2a (HC-J6), type 2b (HC-J8), NZL1, HCV-TR,
positions 7-89 of type 3a (E-b1), and positions 8-88 of type 4a (EG-29). V-Core, variable
region with type-specific features in the core protein; V1, variable region 1 of the E1 protein;
V2, variable region 2 of the E1 protein; V3, variable region 3 of the E1 protein; V4, variable
region 4 of the E1 protein; V5, variable region 5 of the E1 protein.

Figure 6

Alignment of nucleotide sequences of isolates HCCL53, HD10 and BR36, deduced from
clones with SEQ ID NO 29, 31, 33, 35, 37 and 39, from the NS3/4 region between
nucleotides 4664 to 5292, with known sequences from the corresponding region of isolates
HCV-1, HCV-J, HC-J6, and HC-J8, EB1, EB2, EB6 and EB7.

Figure 7 

Alignment of amino acid sequences deduced from the new HCV nucleotide sequences of
the NS3/NS4 region of isolate BR36 (SEQ ID NO 36, 38 and 40) and BE95 (SEQ ID NO
270). NS4-1 indicates the region that was synthesized as synthetic peptide 1 of the NS4
region; NS4-5 indicates the region that was synthesized as synthetic peptide 5 of the NS4
region; NS4-7 indicates the region that was synthesized as synthetic peptide 7 of the NS4
region.

Figure 8

Reactivity of the three LiPA-selected (Stuyver et al., 1993) type 3 sera on the Inno-LIA
HCV Ab II assay (Innogenetics) (left), and on the NS4-LIA test. For the NS4-LIA test, NS4-1,
NS4-5, and NS4-7 peptides were synthesized based on the type 1 (HCV-1), type 2 (HC-J6)
and type 3 (BR36) prototype isolate sequences as shown in Table 4, and applied as parallel
lines onto a membrane strip as indicated. 1, serum BR33; 2, serum HD10; 3, serum DKH.

Figure 9 

Nucleotide sequences of Core/E1 clones obtained from the PCR fragments PC-2, PC-3,
and PC-4, obtained from serum BE95 (PC-2-1 (SEQ ID NO 41), PC-2-6 (SEQ ID NO 43),
PC-4-1 (SEQ ID NO 45), PC-4-6 (SEQ ID NO 47), PC-3-4 (SEQ ID NO 49) and PC-3-8
(SEQ ID NO 51)) of subtype 5a isolate BE95.

A consensus sequence is shown for the Core and E1 region of isolate BE95, presented as
PC C/E1 with SEQ ID NO 53. Y: C or T; R: A or G; S: C or G.
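
The ambiguity symbols used in the consensus (Y, R, S) follow the standard IUPAC nucleotide code. As an illustration only, a consensus with such symbols could be derived from equal-length clone sequences as sketched below; the clone strings are hypothetical toy data, and the assumption that every observed base at a column is folded into the consensus symbol is ours, not a statement of how SEQ ID NO 53 was actually built.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases observed.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def consensus(clones):
    """Column-wise consensus of equal-length, gap-free nucleotide sequences.

    Any column where clones differ is written with the IUPAC symbol covering
    all observed bases (e.g. C and T -> Y). Characters other than A, C, G, T
    raise KeyError.
    """
    if not clones:
        raise ValueError("at least one clone sequence is required")
    if len({len(c) for c in clones}) != 1:
        raise ValueError("clone sequences must have equal length")
    columns = zip(*(c.upper() for c in clones))
    return "".join(IUPAC_CODES[frozenset(col)] for col in columns)


# Hypothetical toy clones (not taken from the sequence listing).
print(consensus(["ACGTC", "ATGTG", "ACATC"]))
```

For real clone data the sequences would first need to be aligned so that the columns correspond.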

Figure 10 

Alignment of nucleotide sequences of clones with SEQ ID NO 197 and 199 (PC sequences,
see also SEQ ID NO 55, 57, 59) and SEQ ID NO 35, 37 and 39 (BR36 sequences) from the
NS3/4 region between nucleotides 3856 to 5292, with known sequences from the
corresponding region of isolates HCV-1, HCV-J, HC-J6, and HC-J8.

Figure 11

Alignment of amino acid sequences of subtype 5a BE95 isolate PC clones with SEQ ID
NO 56 and 58, from the NS3/4 region between amino acids 1286 to 1764, with known
sequences from the corresponding region of isolates HCV-1, HCV-J, HC-J6, and HC-J8.

Figure 12 

Alignment of amino acid sequences of subtype 5a isolate BE95 (SEQ ID NO 158) in the
E1/E2 region spanning positions 328 to 546, with known sequences from the corresponding
region of isolates HCV-1, HCV-J, HC-J6, HC-J8, NZL1 and HCV-TR (see Table 3).

Figure 13

Alignment of the nucleotide sequences of subtype 5a isolate BE95 (SEQ ID NO 157) in
the E1/E2 region with known HCV sequences as shown in Table 3.

EXAMPLES

Example 1: The NS5b region of HCV type 3

Type 3 sera, selected by means of the INNO-LiPA HCV research kit (Stuyver et al., 1993)
from a number of Brazilian blood donors, were positive in the HCV antibody ELISA
(Innotest HCV Ab II; Innogenetics) and/or in the INNO-LIA HCV Ab II confirmation test
(Innogenetics). Only those sera that were positive after the first round of PCR reactions
(Stuyver et al., 1993) were retained for further study.

Reverse transcription and nested PCR: RNA was extracted from 50 µl serum and subjected
to cDNA synthesis as described (Stuyver et al., 1993). This cDNA was used as template for
PCR, for which the total volume was increased to 50 µl containing 10 pmoles of each primer,
3 µl of 10x Pfu buffer 2 (Stratagene) and 2.5 U of Pfu DNA polymerase (Stratagene). The
cDNA was amplified over 45 cycles consisting of 1 min 94°C, 1 min 50°C and 2 min 72°C.
The amplified products were separated by electrophoresis, isolated, cloned and sequenced as
described (Stuyver et al., 1993).

Type 3a and 3b-specific primers in the NS5 region were selected from the published
sequences (Mori et al., 1992) as follows:
for type 3a:

HCPr161(+): 5'-ACCGGAGGCCAGGAGAGTGATCTCCTCC-3' (SEQ ID NO 63) and
HCPr162(-): 5'-GGGCTGCTCTATCCTCATCGACGCCATC-3' (SEQ ID NO 64);
for type 3b:

HCPr163(+): 5'-GCCAGAGGCTCGGAAGGCGATCAGCGCT-3' (SEQ ID NO 65) and
HCPr164(-): 5'-GAGCTGCTCTGTCCTCCTCGACGCCGCA-3' (SEQ ID NO 66)

Using the Line Probe Assay (LiPA) (Stuyver et al., 1993), seven high-titer type 3 sera
were selected and subsequently analyzed with the primer sets HCPr161/162 for type 3a, and
HCPr163/164 for type 3b. None of these sera was positive with the type 3b primers. NS5
PCR fragments obtained using the type 3a primers from serum BR36 (BR36-23), serum BR33
(BR33-2) and serum BR34 (BR34-4) were selected for cloning. The following sequences were
obtained from the PCR fragments:
From fragment BR34-4:
BR34-4-20 (SEQ ID NO 1), BR34-4-19 (SEQ ID NO 3)

From fragment BR36-23:
BR36-23-18 (SEQ ID NO 5), BR36-23-20 (SEQ ID NO 7)

From fragment BR33-2:
BR33-2-17 (SEQ ID NO 9), BR33-2-21 (SEQ ID NO 11)

An alignment of sequences with SEQ ID NO 1, 5 and 9 with known sequences is given
in Figure 1. An alignment of the deduced amino acid sequences is shown in Figure 2. The
3 isolates are very closely related to each other (mutual homologies of about 95%) and to the
published sequences of type 3a (Mori et al., 1992), but are only distantly related to type 1
and type 2 sequences (Table 5). Therefore, it is clearly demonstrated that NS5 sequences
from LiPA-selected type 3 sera are indeed derived from a type 3 genome. Moreover, by
analyzing the NS5 region of serum BR34, for which no 5'UR sequences were determined as
described in Stuyver et al. (1993), the excellent correlation between typing by means of the
LiPA and genotyping as deduced from nucleotide sequencing was further proven.

Example 2: The Core/E1 region of HCV type 3

After aligning the sequences of HCV-1 (Choo et al., 1991), HCV-J (Kato et al., 1990),
HC-J6 (Okamoto et al., 1991), and HC-J8 (Okamoto et al., 1992), PCR primers were chosen
in those regions of little sequence variation. Primers HCPr23(+):
5'-CTCATGGGGTACATTCCGCT-3' (SEQ ID NO 67) and HCPr54(-):
5'-TATTACCAGTTCATCATCATATCCCA-3' (SEQ ID NO 68) were synthesized on a 392
DNA/RNA synthesizer (Applied Biosystems). This set of primers was selected to amplify
the sequence from nucleotide 397 to 957 encoding amino acids 140 to 319 (Kato et al., 1990):
52 amino acids from the carboxyterminus of core and 128 amino acids of E1 (Kato et al.,
1990). The amplification products BR36-9, BR33-1, and HD10-2 were cloned as described
(Stuyver et al., 1993). The following clones were obtained from the PCR fragments:

From fragment HD10-2:
HD10-2-5 (SEQ ID NO 13), HD10-2-14 (SEQ ID NO 15), HD10-2-21 (SEQ ID NO 17)

From fragment BR36-9:
BR36-9-13 (SEQ ID NO 19), BR36-9-20 (SEQ ID NO 21),

From fragment BR33-1:
BR33-1-10 (SEQ ID NO 23), BR33-1-19 (SEQ ID NO 25), BR33-1-20 (SEQ ID NO 27),
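
The coordinate arithmetic in this example (nucleotides 397 to 957, amino acids 140 to 319, 52 core residues plus 128 E1 residues) can be verified with a short sketch. It assumes that nucleotide 1 is the first base of codon 1 of the open reading frame, which is consistent with 3 x 319 = 957, and that core ends at position 191, in line with the cleavage before position 192 discussed in this example. The function names are hypothetical.

```python
def codon_nt_range(aa_pos):
    """Nucleotide positions (first, last) of the codon for a given amino acid.

    Assumes nucleotide 1 is the first base of codon 1 of the open reading frame.
    """
    return 3 * aa_pos - 2, 3 * aa_pos


def aa_span_to_nt(aa_start, aa_end):
    """Nucleotide span covering amino acids aa_start..aa_end inclusive."""
    return codon_nt_range(aa_start)[0], codon_nt_range(aa_end)[1]


def split_at_core_e1(aa_start, aa_end, last_core_residue=191):
    """Number of core and E1 residues in a span crossing the core/E1 boundary."""
    core_len = last_core_residue - aa_start + 1
    e1_len = aa_end - last_core_residue
    return core_len, e1_len


# A 20-nucleotide sense primer starting at nucleotide 397 ends at nucleotide 416,
# so the first complete codon downstream of the primer is codon 140 (418-420).
primer_start, primer_len = 397, 20
primer_end = primer_start + primer_len - 1

print(primer_end)
print(aa_span_to_nt(140, 319))
print(split_at_core_e1(140, 319))
```

Under these assumptions the figures quoted in the text are mutually consistent: codons 140 to 319 occupy nucleotides 418 to 957, and the 180 encoded residues split into 52 core and 128 E1 residues.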

An alignment of the type 3 E1 nucleotide sequences (HD10, BR36, BR33) with SEQ ID
NO 13, 19 and 23 with known E1 sequences is presented in Figure 4. Four variations were
detected in the E1 clones from serum HD10 and BR36, while only 2 were found in BR33.
All are silent third letter variations, with the exception of mutations at position 40 (L to P)
and 125 (M to I). The homologies of the type 3 E1 region (without core) with type 1 and 2
prototype sequences are depicted in Table 5.

In total, 8 clones covering the core/E1 region of 3 different isolates were sequenced and
the E1 portion was compared with the known genotypes (Table 3) as shown in Figure 5.
After computer analysis of the deduced amino acid sequence, a signal-anchor sequence at the
core carboxyterminus was detected which might, through analogy with type 1b (Hijikata et
al., 1991), promote cleavage before the LEWRN sequence (position 192, Fig. 5). The L-to-P
mutation in one of the HD10-2 clones resides in this signal-anchor region and potentially
impairs recognition by signal peptidase (computer prediction). Since no examples of such
substitutions were found at this position in previously described sequences, this mutation
might have resulted from reverse transcriptase or Pfu polymerase misincorporation. The 4
amino-terminal potential N-linked glycosylation sites, which are also present in HCV types
1a and 2, remain conserved in type 3. The N-glycosylation site in type 1b (aa 250, Kato et
al., 1990) remains a unique feature of this subtype. All E1 cysteines, and the putative
transmembrane region (aa 264 to 293, computer prediction) containing the aspartic acid at
position 279, are conserved in all three HCV types. The following hypervariable regions can
be delineated: V1 from aa 192 to 203 (numbering according to Kato et al., 1990), V2 (213-223),
V3 (230-242), V4 (248-257), and V5 (294-303). Such hydrophilic regions are thought
to be exposed to the host defense mechanisms. This variability might therefore have been
induced by the host's immune response. Additional putative N-linked glycosylation sites in
the V4 region in all type 1b isolates known today and in the V5 region of HC-J8 (type 2b)
possibly further contribute to modulation of the immune response. Therefore, analysis of this
region, in the present invention, for type 3 and 4 sequences has been instrumental in the
delineation of epitopes that reside in the V-regions of E1, which will be critical for future
vaccine and diagnostics development.
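
The V1 to V5 boundaries given above can be used to cut the corresponding segments out of any E1 sequence that carries polyprotein numbering. The sketch below is illustrative only: the region table is copied from the text, while the function name, the gap-free numbering assumption and the placeholder sequence are hypothetical.

```python
# Hypervariable regions of E1 as delineated in the text (inclusive positions,
# numbering according to Kato et al., 1990).
E1_VARIABLE_REGIONS = {
    "V1": (192, 203),
    "V2": (213, 223),
    "V3": (230, 242),
    "V4": (248, 257),
    "V5": (294, 303),
}


def extract_regions(sequence, first_pos, regions=E1_VARIABLE_REGIONS):
    """Return the sub-sequence of each region fully contained in `sequence`.

    sequence:  amino acid string without gaps (assumption).
    first_pos: polyprotein position of sequence[0], e.g. 192 for an E1 fragment.
    Regions that fall partly outside the sequence are skipped.
    """
    extracted = {}
    for name, (start, end) in regions.items():
        lo, hi = start - first_pos, end - first_pos + 1
        if lo >= 0 and hi <= len(sequence):
            extracted[name] = sequence[lo:hi]
    return extracted


# Placeholder 128-residue string standing in for E1 positions 192 to 319
# (letters cycle A..Z; this is not a real HCV sequence).
placeholder_e1 = "".join(chr(ord("A") + (i % 26)) for i in range(128))
segments = extract_regions(placeholder_e1, first_pos=192)
for name, segment in segments.items():
    print(name, len(segment), segment)
```

Applied to aligned isolates, the extracted segments could then be compared region by region.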

Example 3: The NS3/NS4 region of HCV type 3

For the NS3/NS4 border region, the following sets of primers were selected in the regions
of little sequence variability after aligning the sequences of HCV-1 (Choo et al., 1991), HCV-J
(Kato et al., 1990), HC-J6 (Okamoto et al., 1991), and HC-J8 (Okamoto et al., 1992)
(lower case lettering is used for nucleotides added for cloning purposes):
set A:

HCPr116(+): 5'-ttttAAATACATCATGRCITGYATG-3' (SEQ ID NO 69)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set B:

HCPr116(+): 5'-ttttAAATACATCATGRCITGYATG-3' (SEQ ID NO 69)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set C:

HCPr117(+): 5'-ttttAAATACATCGCIRCITGCATGCA-3' (SEQ ID NO 72)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set D:

HCPr117(+): 5'-ttttAAATACATCGCIRCITGCATGCA-3' (SEQ ID NO 72)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set E:

HCPr116(+): 5'-ttttAAATACATCATGRCITGYATG-3' (SEQ ID NO 69)
HCPr119(-): 5'-actagtcgactaRTTIGCIATIAGCCG.TRTTCATCCA^TG-3' (SEQ ID NO 73)
set F:

HCPr117(+): 5'-ttttAAATACATCGCIRCITGCATGCA-3' (SEQ ID NO 72)
HCPr119(-): 5'-actagtcgactaRTTIGCIATIAGCCG/TRTTCATCCA\TG-3' (SEQ ID NO 73)
set G:

HCPr131(+): 5'-ggaattctagaCCITCITGGGAYGAR.\YITGGAARTG-3' (SEQ ID NO 74)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set H:

HCPr130(+): 5'-ggaattctagACIGCITAYCARGCIACIGTITGYGC-3' (SEQ ID NO 75)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set I:

HCPr134(+): 5'-CATATAGATGCCCACTTCCTATC-3' (SEQ ID NO 76)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set J:

HCPr131(+): 5'-ggaattctagaCCITCITGGGAYGAR.\YITGGAARTG-3' (SEQ ID NO 74)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set K:

HCPr130(+): 5'-ggaattctagACIGCITAYCARGCIACIGTITGYGC-3' (SEQ ID NO 75)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set L:

HCPr134(+): 5'-CATATAGATGCCCACTTCCTATC-3' (SEQ ID NO 76)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set M:

HCPr3(+): 5'-GTGTGCCAGGACCATC-3' (SEQ ID NO 77) and
HCPr4(-): 5'-GACATGCATGTCATGATGTA-3' (SEQ ID NO 78)
set N:

HCPr3(+): 5'-GTGTGCCAGGACCATC-3' (SEQ ID NO 77) and
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set O:

HCPr3(+): 5'-GTGTGCCAGGACCATC-3' (SEQ ID NO 77) and
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)

No PCR product could be obtained with the sets of primers A, B, C, D, E, F, G, H, I,
J, K, L, M, and N on random-primed cDNA obtained from type 3 sera. With the primer set
O, no distinct fragment could be amplified from type 3 sera either. However, a smear
containing a few weakly stainable bands was obtained from serum BR36. After sequence
analysis of several DNA fragments, purified and cloned from the area around 300 bp on the
agarose gel, only one clone, HCC155 (SEQ ID NO 29), was shown to contain HCV information.
This sequence was used to design primer HCPr152.
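
Several of the primers above carry lower-case cloning tails, degenerate positions (R, Y, W) and inosine (I). As a rough illustration of what such notation implies, the sketch below separates the cloning tail from the target-specific part and enumerates the concrete sequences covered by the degenerate positions, using HCPr66 as written in the text. Treating inosine as a single fixed position (it is a base analogue that pairs with several bases, so it adds no mixture complexity) is an assumption of this sketch, and the function names are hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes; 'I' (inosine) is kept as a single fixed symbol.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "I": "I",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def split_primer(primer):
    """Split a primer into (lower-case cloning tail, upper-case target part)."""
    i = 0
    while i < len(primer) and primer[i].islower():
        i += 1
    return primer[:i], primer[i:]


def degeneracy(target):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in target.upper():
        count *= len(IUPAC_BASES[base])
    return count


def expand(target):
    """List every concrete sequence covered by the degenerate positions."""
    choices = (IUPAC_BASES[base] for base in target.upper())
    return ["".join(combo) for combo in product(*choices)]


hcpr66 = "ctattaTTGTATCCCRCTGATGAARTTCCACAT"  # as given in the text (SEQ ID NO 70)
tail, target = split_primer(hcpr66)
print(tail, target)
print(degeneracy(target))
for variant in expand(target):
    print(variant)
```

Full expansion is only practical for low-degeneracy primers; for highly degenerate ones, `degeneracy` alone gives the size of the mixture.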

A new primer set P was subsequently tested on several sera.
set P:

HCPr152(+): 5'-TACGCCTCTTCTATATCGGTTGGGGCCTG-3' (SEQ ID NO 79) and
HCPr66(-): 5'-CTATTATTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)

The 464-bp HCPr152/66 fragment was obtained from serum BR36 (BR36-20) and serum
HD10 (HD10-1). The following clones were obtained from these PCR products:

From fragment HD10-1:
HD10-1-25 (SEQ ID NO 31), HD10-1-3 (SEQ ID NO 33),

From fragment BR36-20:
BR36-20-164 (SEQ ID NO 35), BR36-20-165 (SEQ ID NO 37), BR36-20-166 (SEQ ID
NO 39),

The nucleotide sequences obtained from clones with SEQ ID NO 29, 31, 33, 35, 37 or
39 are shown aligned with the sequences of prototype isolates of other types of HCV in
Figure 6. In addition to one silent 3rd letter variation, one 2nd letter mutation resulted in an
E to G substitution at position 175 of the deduced amino acid sequence of BR36 (Fig. 7).
Serum HD10 clones were completely identical. The two type 3 isolates were nearly 94%
homologous in this NS4 region. The homologies with other types are presented in Table 5.

Example 4: Analysis of the anti-NS4 response to type-specific peptides

As the NS4 sequence contains the information for an important epitope cluster, and since
antibodies towards this region seem to exhibit little cross-reactivity (Chan et al., 1991), it was
worthwhile to investigate the type-specific antibody response to this region. For each of the
3 genotypes, HCV-1 (Choo et al., 1991), HC-J6 (Okamoto et al., 1991) and BR36 (present
invention), three 20-mer peptides were synthesized covering the epitope region between amino
acids 1688 and 1743 (as depicted in Table 6). The synthetic peptides were applied as parallel
lines onto membrane strips. Detection of anti-NS4 antibodies and color development were
performed according to the procedure described for the INNO-LIA HCV Ab II kit
(Innogenetics, Antwerp). Peptide synthesis was carried out on a 9050 PepSynthesizer
(Millipore). After incubation with 15 LiPA-selected type 3 sera, 9 samples showed reactivity
towards NS4 peptides of at least 2 different types, but a clearly positive reaction was
observed for 3 sera (serum BR33, HD30 and DKH) on the type 3 peptides, while these sera
were negative (serum BR33 and HD30) or indeterminate (serum DKH) on the type 1 and type 2
NS4 peptides; 3 sera tested negative for anti-NS4 antibodies (Figure 8). Using the same membrane
strips coated with the 9 peptides as indicated above and as shown in Figure 8, 38 type 1 sera
(10 type 1a and 28 type 1b), 11 type 2 sera (10 type 2a and 1 type 2b), 12 type 3a sera and
2 type 4 sera (as determined by the LiPA procedure) were also tested. As shown in Table 8,
the sera reacted in a genotype-specific manner with the NS4 epitopes. These results
demonstrate that type-specific anti-NS4 antibodies can be detected in the sera of some
patients. Such genotype-specific synthetic peptides might be employed to develop serotyping
assays, for example a mixture of the nine peptides as indicated above, or combined with the
NS4 peptides from the HCV type 4 or 6 genotype or from new genotypes corresponding to
the region between amino acids 1683 and 1743, or synthetic peptides of the NS4 region
between amino acids 1688 and 1743 of at least one of the 6 genotypes, combined with the E1
protein or deletion mutants thereof, or synthetic E1 peptides of at least one of the genotypes.
Such compositions could be further extended with type-specific peptides or proteins, including
for example the region between amino acids 68 and 91 of the core protein, or more
preferably the region between amino acids 68 and 78. Furthermore, such type-specific
antigens may be advantageously used to improve current diagnostic screening and
confirmation assays and/or HCV vaccines.

Example 5: The Core and E1 regions of HCV type 5

Sample BE95 was selected from a group of sera that reacted positive in a prototype Line
Probe Assay as described earlier (Stuyver et al., 1993), because a high titer of HCV RNA
could be detected, enabling cloning of fragments by a single round of PCR. As no sequences
from any coding region of type 5 had been disclosed yet, synthetic oligonucleotides for PCR
amplification were chosen in the regions of little sequence variation after aligning the
sequences of HCV-1 (Choo et al., 1991), HCV-J (Kato et al., 1990), HC-J6 (Okamoto et
al., 1991), HC-J8 (Okamoto et al., 1992), and the new type 3 sequences of the present
invention HD10, BR33, and BR36 (see Figure 5, Example 2). The following sets of primers
were synthesized on a 392 DNA/RNA synthesizer (Applied Biosystems):
Set 1:

HCPr52(+): 5'-atgTTGGGTAAGGTCATCGATACCCT-3' (SEQ ID NO 80) and
HCPr54(-): 5'-ctattaCCAGTTCATCATCATATCCCA-3' (SEQ ID NO 78)
Set 2:

HCPr41(+): 5'-CCCGGGAGGTCTCGTAGACCGTGCA-3' (SEQ ID NO 81) and
HCPr40(-): 5'-ctattaAAGATAGAGAAAGAGCAACCGGG-3' (SEQ ID NO 82)
Set 3:

HCPr41(+): 5'-CCCGGGAGGTCTCGTAGACCGTGCA-3' (SEQ ID NO 81) and
HCPr54(-): 5'-ctattaCCAGTTCATCATCATATCCCA-3' (SEQ ID NO 78)
The three sets of primers were employed to amplify the regions of the type 5 isolate PC
as described (Stuyver et al., 1993). Set 1 was used to amplify the E1 region and yielded
fragment PC-4; set 2 was designed to yield the Core region and yielded fragment PC-2. Set
3 was used to amplify the Core and E1 region and yielded fragment PC-3. These fragments
were cloned as described (Stuyver et al., 1993). The following clones were obtained from the
PCR fragments:

From fragment PC-2:
PC-2-1 (SEQ ID NO 41), PC-2-6 (SEQ ID NO 43),

From fragment PC-4:
PC-4-1 (SEQ ID NO 45), PC-4-6 (SEQ ID NO 47),

From fragment PC-3:
PC-3-4 (SEQ ID NO 49), PC-3-8 (SEQ ID NO 51)

An alignment of sequences with SEQ ID NO 41, 43, 45, 47, 49 and 51 is given in Figure
9. A consensus amino acid sequence (PC C/E1: SEQ ID NO 54) can be deduced from each
of the 2 clones cloned from each of the three PCR fragments as depicted in Figure 5, which
overlaps the region between nucleotides 1 and 957 (Kato et al., 1990). The 6 clones are very
closely related to each other (mutual homologies of about 99.7%).

An alignment of nucleotide sequence with SEQ ID NO 53 or 151 (PC C/E1 from isolate
BE95) with known nucleotide sequences from the Core/E1 region is given in Figure 3. The
clone is only distantly related to type 1, type 2, type 3 and type 4 sequences (Table 5).

Example 6 : NS3/NS4 region of HCV type 5

Attempts were undertaken to clone the NS3/NS4 region of the isolate BE95, described in
example 5. The following sets of primers were selected in the regions of little sequence
variability after aligning the sequences of HCV-1 (Choo et al., 1991), HCV-J (Kato et al.,
1990), HC-J6 (Okamoto et al., 1991), and HC-J8 (Okamoto et al., 1992) and of the
sequences obtained from type 3 sera of the present invention (SEQ ID NO 31, 33, 35, 37 and
39); lower case lettering is used for nucleotides added for cloning purposes:
set A:

HCPr116(+): 5'-ttttAAATACATCATGRCITGYATG-3' (SEQ ID NO 69)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set B:

HCPr116(+): 5'-ttttAAATACATCATGRCITGYATG-3' (SEQ ID NO 69)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set C:

HCPr117(+): 5'-ttttAAATACATCGCIRCITGCATGCA-3' (SEQ ID NO 72)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set D:

HCPr117(+): 5'-ttttAAATACATCGCIRCITGCATGCA-3' (SEQ ID NO 72)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set E:

HCPr116(+): 5'-ttttAAATACATCATGRCITGYATG-3' (SEQ ID NO 69)
HCPr119(-): 5'-actagtcgactaRTTIGCIATIAGCCG/TRTTCATCCAYTG-3' (SEQ ID NO 73)
set F:

HCPr117(+): 5'-ttttAAATACATCGCIRCITGCATGCA-3' (SEQ ID NO 72)
HCPr119(-): 5'-actagtcgactaRTTIGCIATIAGCCG/TRTTCATCCAYTG-3' (SEQ ID NO 73)
set G:

HCPr131(+): 5'-ggaattctagaCCITCITGGGAYGARAYITGGAARTG-3' (SEQ ID NO 74)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set H:

HCPr130(+): 5'-ggaattctagACIGCITAYCARGCIACIGTITGYGC-3' (SEQ ID NO 75)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set I:

HCPr134(+): 5'-CATATAGATGCCCACTTCCTATC-3' (SEQ ID NO 76)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set J:

HCPr131(+): 5'-ggaattctagaCCITCITGGGAYGARAYITGGAARTG-3' (SEQ ID NO 74)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set K:

HCPr130(+): 5'-ggaattctagACIGCITAYCARGCIACIGTITGYGC-3' (SEQ ID NO 75)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set L:

HCPr134(+): 5'-CATATAGATGCCCACTTCCTATC-3' (SEQ ID NO 76)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set M:

HCPr3(+): 5'-GTGTGCCAGGACCATC-3' (SEQ ID NO 77) and
HCPr4(-): 5'-GACATGCATGTCATGATGTA-3' (SEQ ID NO 78)
set N:

HCPr3(+): 5'-GTGTGCCAGGACCATC-3' (SEQ ID NO 77) and
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set O:

HCPr3(+): 5'-GTGTGCCAGGACCATC-3' (SEQ ID NO 77) and
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
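
Several of the primers listed above are degenerate: they contain ambiguity codes (R, Y, W) and inosine (I). As a rough illustration, the following Python sketch counts how many distinct oligonucleotide species such a primer mix represents. The function name `degeneracy` is hypothetical, and treating inosine as a single synthesized base that does not multiply the count is an assumption of this sketch rather than a statement from the text.

```python
# Number of bases each IUPAC ambiguity code stands for.
# Assumption: "I" (inosine) is one synthesized species that pairs broadly,
# so it contributes a factor of 1 here.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1, "I": 1,
    "R": 2, "Y": 2, "W": 2, "S": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer):
    """Return the number of distinct sequences in a degenerate primer mix."""
    count = 1
    for base in primer.upper():
        count *= IUPAC_CHOICES[base]
    return count


# HCV-specific part of HCPr116 (the "tttt" cloning tail is omitted).
print(degeneracy("AAATACATCATGRCITGYATG"))
# Full HCPr118, including the lower-case cloning tail.
print(degeneracy("actagtcgactaYTGIATICCRCTIATRWARTTCCACAT"))
```

A higher count means that each individual species is present at a lower concentration in the reaction, which is one reason degenerate positions are usually kept to a minimum.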

No PCR products could be obtained with the sets of primers A, B, C, D, E, F, G,
H, I, J, K, L, M, and N, on random-primed cDNA obtained from type 3 sera. However,
set O yielded what appeared to be a PCR artifact fragment estimated at about 1450 base
pairs, instead of the expected 628 base pairs. Although it is not expected that PCR artifact
fragments contain information of the gene or genome that was targeted in the experiment,
efforts were put into cloning this artifact fragment, which was designated fragment PC-1.
The following clones were obtained from fragment PC-1:

PC-1-37 (SEQ ID NO 59 and SEQ ID NO 55), PC-1-48 (SEQ ID NO 61 and SEQ ID NO 57)

The sequences obtained from the 5' and 3' ends of the clones are given in SEQ ID NOS
55, 57, 59, and 61, and the complete sequences with SEQ ID NO 197 and 199 are shown
aligned with the sequences of prototype isolates of other types of HCV in Figure 10 and the
alignment of the deduced amino acid sequences is shown in Figure 11 and 7. Surprisingly,
the PCR artifact clone contained HCV information. The positions of the sequences within the
HCV genome are compatible with a contiguous HCV sequence of 1437 nucleotides, which
was the estimated size of the cloned PCR artifact fragment. Primer HCPr66 primed correctly
at the expected position in the HCV genome. Therefore, primer HCPr3 must have
incidentally misprimed at a position 809 nucleotides upstream of its legitimate position in the
HCV genome. This could not be expected since no sequence information was available from
a coding region of type 5.
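
The size argument above can be checked with simple arithmetic. The sketch below adds the upstream mispriming offset to the expected product length and compares the result with the length of the contiguous sequence. The constant and function names are illustrative only.

```python
EXPECTED_PRODUCT_BP = 628    # expected size for primer set O (from the text)
MISPRIMING_OFFSET_NT = 809   # HCPr3 annealed this far upstream of its legitimate site


def observed_fragment_length(expected_bp, upstream_offset_nt):
    """Length of a product whose forward primer annealed upstream of its target."""
    return expected_bp + upstream_offset_nt


length = observed_fragment_length(EXPECTED_PRODUCT_BP, MISPRIMING_OFFSET_NT)
print(length)  # 1437, the contiguous HCV sequence length reported above
```

The result is close to the roughly 1450 base pairs estimated for the fragment, which supports the mispriming explanation.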

Example 7 : The E2 region of HCV type 5

Serum BE95 was chosen for experiments aimed at amplifying a part of the E2 region of HCV 
type 5. 

After aligning the sequences of HCV-1 (2), HCV-J (1), HC-J6 (3), and HC-J8 (4), PCR
primers were chosen in those regions of little sequence variation.

Primers HCPr109(+): 5'-TGGGATATGATGATGAACTGGTC-3' (SEQ ID NO 141) and
HCPr14(-): 5'-CCAGGTACAACCGAACCAATTGCC-3' (SEQ ID NO 142) were combined
to amplify the aminoterminal region of the E2/NS1 region, and were synthesized on a 392
DNA/RNA synthesizer (Applied Biosystems). With primers HCPr109 and HCPr14, a PCR
fragment of 661 bp was generated, containing 169 nucleotides corresponding to the E1
carboxyterminus and 492 bases from the region encoding the E2 aminoterminus.

An alignment of the type 5 E1/E2 sequences with SEQ ID NO 158 with known sequences is
presented in Figure 10. The deduced protein sequence was compared with the different
genotypes (Fig. 12, amino acids 328-546). In the E1 region, there were no extra structural
important motifs found. The aminoterminal part of E2 was hypervariable when compared
with the other genotypes. All 6 N-glycosylation sites and all 7 cysteine residues were
conserved in this E2 region. To preserve alignment, it was necessary to introduce a gap
between aa 474 and 475 as for type 3a, but not between aa 480 and 481, as for type 2.
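
Potential N-glycosylation sites are conventionally located by searching for the sequon Asn-X-Ser/Thr, where X is any residue except proline. The Python sketch below shows how such sites and the cysteine residues could be listed for a deduced protein sequence. The function names and the toy peptide are hypothetical. The peptide is not an HCV sequence, and the sketch makes no claim about how the sites in this example were actually counted.

```python
import re


def n_glycosylation_sites(protein):
    """1-based positions of Asn residues in N-X-S/T sequons (X is not Pro)."""
    # The lookahead keeps overlapping sequons such as "NNTS" from being missed.
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", protein.upper())]


def cysteine_positions(protein):
    """1-based positions of all cysteine residues."""
    return [i + 1 for i, aa in enumerate(protein.upper()) if aa == "C"]


toy = "MNGTCANPSCNAS"  # hypothetical peptide for illustration only
print(n_glycosylation_sites(toy))
print(cysteine_positions(toy))
```

Applied to two aligned E2 sequences, comparing the returned position lists column by column would indicate which sites are conserved.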

Example 8 : The NS5b region of HCV type 4

Type 4 sera GB48, GB116, GB215, and GB358, selected by means of the line probe assay
(LiPA, Stuyver et al., 1993), as well as sera GB549 and GB809 that could not be typed by
means of this LiPA (only hybridization was observed with the universal probes), were
selected from Gabonese patients. All these sera were positive after the first round of PCR
reactions for the 5' untranslated region (Stuyver et al., 1993) and were retained for further
study.

RNA was isolated from the sera and cDNA synthesized as described in example 1.
Universal primers in the NS5 region were selected after alignment of the published sequences
as follows:

HCPr206(+): 5'-TGGGGATCCCGTATGATACCCGCTGCTTTGA-3'
(SEQ ID NO. 124) and

HCPr207(-): 5'-GGCGGAATTCCTGGTCATAGCCTCCGTGAA-3'
(SEQ ID NO. 125);

and were synthesized on a 392 DNA/RNA synthesizer (Applied Biosystems). Using the Line
Probe Assay (LiPA), four high-titer type 4 sera and 2 sera that could not be classified were
selected and subsequently analyzed with the primer set HCPr206/207. NS5 PCR fragments
obtained using these primers from serum GB48 (GB48-3), serum GB116 (GB116-3), serum
GB215 (GB215-3), serum GB358 (GB358-3), serum GB549 (GB549-3), and serum GB809
(GB809-3), were selected for cloning. The following sequences were obtained from the PCR
fragments:

From fragment GB48-3: GB48-3-10 (SEQ ID NO. 106)
From fragment GB116-3: GB116-3-5 (SEQ ID NO. 108)
From fragment GB215-3: GB215-3-8 (SEQ ID NO. 110)
From fragment GB358-3: GB358-3-3 (SEQ ID NO. 112)
From fragment GB549-3: GB549-3-6 (SEQ ID NO. 114)
From fragment GB809-3: GB809-3-1 (SEQ ID NO. 116)

An alignment of nucleotide sequences with SEQ ID NO. 106, 108, 110, 112, 114, and 116
with known sequences is given in Figure 1. An alignment of deduced amino acid sequences
with SEQ ID NO. 107, 109, 111, 113, 115, and 117 with known sequences is given in Figure
2. The 4 isolates that had been typed as type 4 by means of LiPA are very closely related to
each other (mutual homologies of about 95%), but are only distantly related to type 1, type
2, and type 3 sequences (e.g. GB358 shows homologies of 65.6 to 67.7% with other
genotypes, Table 4). The sequences obtained from sera GB549 and GB809 also show similar
homologies with genotypes 1, 2, and 3 (65.9 to 68.8% for GB549 and 65.0 to 68.5% for
GB809, Table 4), but an intermediate homology of 79.7 to 86.8% (often observed between
subtypes of the same type) exists between GB549 or GB809 and the group of isolates
consisting of GB48, GB116, GB215, and GB358, or between GB549 and GB809. These data
indicate the discovery of 3 new subtypes within the HCV genotype 4: in the present
invention, these 3 subtypes are designated subtype 4c, represented by isolates GB48, GB116,
GB215, and GB358, subtype 4g, represented by isolate GB549, and subtype 4e, represented
by isolate GB809. Although the homologies observed between subtypes in the NS5 region
seem to indicate a closer relationship between subtypes 4c and 4e, the homologies observed
in the E1 region indicate that subtypes 4g and 4e show the closest relation (see example 9).
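
The classification reasoning used here (about 95% between isolates of one subtype, roughly 80 to 87% between subtypes of one type, and roughly 65 to 69% between types) can be sketched in Python as follows. The function names and the two cut-off values (90% and 72%) are assumptions chosen only to separate the ranges quoted in this example. They are not thresholds defined by the text, and real homology ranges differ between genome regions.

```python
def percent_identity(seq_a, seq_b):
    """Percent identical positions between two aligned, equal-length sequences.

    Alignment columns with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def relationship(identity, subtype_cutoff=90.0, type_cutoff=72.0):
    """Rough NS5b-style classification; both cut-offs are illustrative assumptions."""
    if identity >= subtype_cutoff:
        return "same subtype"
    if identity >= type_cutoff:
        return "different subtypes of the same type"
    return "different types"


print(relationship(95.0))   # isolates GB48, GB116, GB215, GB358 among themselves
print(relationship(86.8))   # upper end of the intermediate range
print(relationship(65.6))   # GB358 compared with other genotypes
```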

Example 9 : The Core/E1 region of HCV type 4

From each of the 3 new type 4 subtypes, one representative serum was selected for cloning
experiments in the Core/E1 region. GB549 (subtype 4g) and GB809 (subtype 4e) were
analyzed together with isolate GB358 that was chosen from the subtype 4c group.
Synthetic oligonucleotides:

After aligning the sequences of HCV-1 (2), HCV-J (1), HC-J6 (3), and HC-J8 (4), PCR
primers were chosen in those regions of little sequence variation.

Primers HCPr52(+): 5'-atgTTGGGTAAGGTCATCGATACCCT-3',
HCPr23(+): 5'-CTCATGGGGTACATTCCGCT-3', and
HCPr54(-): 5'-CTATTACCAGTTCATCATCATATCCCA-3', were synthesized on a 392 DNA/RNA
synthesizer (Applied Biosystems). The sets of primers HCPr23/54 and HCPr52/54 were used,
but PCR fragments could be obtained only with the primer set HCPr52/54. This set of
primers amplified the sequence from nucleotide 379 to 957 encoding amino acids 127 to 319:
65 amino acids from the carboxyterminus of core and 128 amino acids of E1. The
amplification products GB358-4, GB549-4, and GB809-4 were cloned as described in example
1. The following clones were obtained from the PCR fragments:
From fragment GB358-4: GB358-4-1 (SEQ ID NO 118)
From fragment GB549-4: GB549-4o (SEQ ID NO 120)
From fragment GB809-4: GB809-4-3 (SEQ ID NO 122)
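
The coordinates quoted for the HCPr52/54 amplicon can be verified with a short calculation. The sketch below assumes that nucleotide 1 is the first nucleotide of the polyprotein reading frame (which is what the pairing of nucleotide 379 with amino acid 127 implies) and uses the core/E1 cleavage between amino acids 191 and 192 mentioned in Example 11. The function and constant names are illustrative.

```python
CORE_LAST_AA = 191  # core/E1 cleavage lies between aa 191 and 192 (see Example 11)


def aa_range(first_nt, last_nt):
    """Map a 1-based, in-frame nucleotide span to the amino acids it encodes."""
    first_aa = (first_nt - 1) // 3 + 1
    last_aa = last_nt // 3
    return first_aa, last_aa


first_aa, last_aa = aa_range(379, 957)
core_residues = CORE_LAST_AA - first_aa + 1   # residues up to the end of core
e1_residues = last_aa - CORE_LAST_AA          # residues from the start of E1
print(first_aa, last_aa, core_residues, e1_residues)  # 127 319 65 128
```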

An alignment of the type 4 Core/E1 nucleotide sequences with SEQ ID NO. 118, 120, and 122
with known sequences is presented in Figure 4. The homologies of the type 4 E1 region
(without core) with type 1, type 2, type 3, and type 5 prototype sequences are depicted in
Table 4. Homologies of 53 to 66% are observed with representative isolates of non-type 4
genotypes. Observed homologies in the E1 region within type 4, between the different
subtypes, range from 75.2 to 78.4%. The recently disclosed sequences of the core region
of Egyptian type 4 isolates (for example EG-29 in Figure 3) described by Simmonds et al.
(1993) do not allow alignment with the Gabonese sequences (as described in the present
invention) in the NS5B region and may belong to different type 4 subtype(s) as can be
deduced from the core sequences. The deduced amino acid sequences with SEQ ID NO 119,
121, and 123 are aligned with other prototype sequences in Figure 5. Again, type-specific
variation mainly resides in the variable V regions, designated in the present invention, and
therefore, type-4-specific amino acids or V regions will be instrumental in diagnosis and
therapeutics for HCV type 4.

Example 10 : The Core/E1 and NS5b regions of new HCV type 2, 3 and 4 subtypes

Samples NE92 (subtype 2d), BE98 (subtype 3c), CAM600 and GB809 (subtype 4e),
CAMG22 and CAMG27 (subtype 4f), GB438 (subtype 4h), CAR4/1205 (subtype 4i),
CAR1/501 (subtype 4j), CAR1/901 (subtype 4?), and GB724 (subtype 4?) were selected from
a group of sera that reacted positive but aberrantly in a prototype Line Probe Assay as
described earlier (Stuyver et al., 1993). Another type 5a isolate BE100 was also analyzed in
the C/E1 region, and yet another type 5a isolate BE96 in the NS5b region. A high titer of
HCV RNA could be detected, enabling cloning of fragments by a single round of PCR. As
no sequences from any coding region of these subtypes had been disclosed yet, synthetic
oligonucleotides for PCR amplification were chosen in the regions of little sequence variation
after aligning the sequences of HCV-1 (Choo et al., 1991), HCV-J (Kato et al., 1990), HC-J6
(Okamoto et al., 1991), HC-J8 (Okamoto et al., 1992), and the other new sequences of the
present invention.
The above mentioned sets 1, 2 and 3 (see example 5) of primers were used, but only with
set 1 could PCR fragments be obtained from all isolates (except for BE98, GB724, and
CAR1/501). This set of primers amplified the sequence from nucleotide 379 to 957 encoding
amino acids 127 to 319: 65 amino acids from the carboxyterminus of core and 128 amino
acids of E1. With set 3, the Core/E1 region from isolates NE92 and BE98 could be amplified,
and with set 2, the core region of GB358, GB724, GB809, and CAM600 could be amplified.
The amplification products were cloned as described in example 1. The following clones were
obtained from the PCR fragments:

From isolate GB724, the clone with SEQ ID NO 193 from the core region,
From isolate NE92, the clone with SEQ ID NO 143,

From isolate BE98, the clone from the core/E1 region of which part of the sequence has been
analyzed and is given in SEQ ID NO 147,

From isolate CAM600, the clone with SEQ ID NO 167 from the E1 region, or SEQ ID NO
165 from the Core/E1 region as shown in Figure 3,

From isolate CAMG22, the clone with SEQ ID NO 171 from the E1 region as shown in
Figure 4,

from isolate GB358, the clone with SEQ ID NO 191 in the core region,
from isolate CAMG27, the clone with SEQ ID NO 173 from the core/E1 region,
from isolate GB438, the clone with SEQ ID NO 177 from the core/E1 region,
from isolate CAR4/1205, the clone with SEQ ID NO 179 from the core/E1 region,
from isolate CAR1/901, the clone with SEQ ID NO 181 from the core/E1 region,
from isolate GB809, the clone GB809-4 with SEQ ID NO 189 from the core/E1 region,
clone GB809-2 with SEQ ID NO 169 from the core/E1 region and the clone with SEQ ID
NO 163 from the core region,

and from isolate BE100, the clone with SEQ ID NO 155 from the Core/E1 region as shown
in Figure 4.

An alignment of these Core/E1 sequences with known Core/E1 sequences is presented in
Figure 4. The deduced amino acid sequences with SEQ ID NO 144, 148, 164, 168, 170, 172,
174, 178, 180, 182, 190, 192, 194, 156, 166 are aligned with other prototype sequences in
Figure 5. Again, type-specific variation mainly resides in the variable V regions, designated
in the present invention, and therefore, type 2d, 3c and type 4-specific amino acids or V
regions will be instrumental in diagnosis and therapeutics for HCV type (subtype) 2d, 3c or
the different type 4 subtypes.
The NS5b region of isolates NE92, BE98, CAM600, CAMG22, GB438, CAR4/1205,
CAR1/501, and BE96 was amplified with primers HCPr206 and HCPr207 (Table 7). The
corresponding clones were cloned and sequenced as in example 1 and the corresponding
sequences (of which BE98 was partly sequenced) received the following identification
numbers:

NE92: SEQ ID NO 145 
BE98: SEQ ID NO 149 
CAM600: SEQ ID NO 201 
CAMG22: SEQ ID NO 203 
GB438: SEQ ID NO 207 
CAR4/1205: SEQ ID NO 209 
CAR1/501: SEQ ID NO 211
BE95: SEQ ID NO 159 
BE96: SEQ ID NO 161 

An alignment of these NS5b sequences with known NS5b sequences is presented in Figure
1. The deduced amino acid sequences with SEQ ID NO 146, 150, 202, 204, 206, 208, 210,
212, 160, 162 are aligned with other prototype sequences in Figure 2. Again, subtype-specific
variations can be observed, and therefore, type 2d, 3c and type 4-specific amino acids or V
regions will be instrumental in diagnosis and therapeutics for HCV type (subtype) 2d, 3c or
the different type 4 subtypes.

Example 11 : Genotype-specific reactivity of anti-E1 antibodies (Serotyping)

E1 proteins were expressed from vaccinia virus constructs containing a core/E1 region
extending from nucleotide positions 355 to 978 (Core/E1 clones described in previous
examples including the primers HCPr52 and HCPr54), and expressed proteins from L119
(after the initiator methionine) to W326 of the HCV polyprotein. The expressed protein was
modified upon expression in the appropriate host cells (e.g. HeLa, RK13, HuTK-, HepG2)
by cleavage between amino acids 191 and 192 of the HCV polyprotein and by the addition
of high-mannose type carbohydrate motifs. Therefore, a 30 to 32 kDa glycoprotein could be
observed on western blot by means of detection with serum from patients with hepatitis C.

As a reference, a genotype 1b clone obtained from the isolate HCV-B was also expressed
in an identical way as described above, and was expressed from recombinant vaccinia virus
vvHCV-11A.
A panel of 104 genotyped sera was first tested for reactivity with a cell lysate containing
type 1b protein expressed from the recombinant vaccinia virus vvHCV-11A, and compared
with cell lysate of RK13 cells infected with a wild type vaccinia virus ('E1/WT'). The lysates
were coated as a 1/20 dilution on a normal ELISA microtiter plate (Nunc maxisorb) and left
to react with a 1/20 dilution of the respective sera. The panel consisted of 14 type 1a, 38
type 1b, 21 type 2, 21 type 3a, and 9 type 4 sera. Human antibodies were subsequently
detected by a goat anti-human IgG conjugated with peroxidase and the enzyme activity was
detected. The optical density values of the E1 and wild type lysates were divided and a factor
2 was taken as the cut-off. The results are given in Table A. Eleven out of 14 type 1a sera
(79%), 25 out of 38 type 1b sera (66%), 6 out of 21 type 2 sera (29%), and 5 out of 21
type 3a sera (24%) reacted, while none of the 9 type 4 sera or the type 5 serum reacted (0%).
These experiments clearly show the high prevalence of anti-E1 antibodies reactive with the
type 1 E1 protein in patients infected with type 1 (36/52 (69%)) (either type 1a or type 1b),
but the low prevalence or absence in non-type 1 sera (11/52 (21%)).
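
The cut-off rule described above can be sketched in Python. The example applies it to the 21 type 3 ratios listed in Table A. The function name is hypothetical, and counting a ratio exactly equal to 2 as reactive is an assumption, since the text does not say whether the cut-off is inclusive. Two of the values (1.11 and 1.54) are read from poorly legible entries, and neither is near the cut-off.

```python
CUTOFF = 2.0  # E1 lysate OD divided by wild-type lysate OD


def count_reactive(ratios, cutoff=CUTOFF):
    """Count sera whose E1/WT ratio reaches the cut-off (assumed inclusive)."""
    return sum(1 for ratio in ratios if ratio >= cutoff)


# E1/WT ratios of the 21 type 3 sera, in the order given in Table A.
type3_ratios = [
    6.88, 1.47, 3.06, 6.52, 10.24, 2.72, 1.11, 1.54, 1.60, 1.21, 1.07,
    1.00, 0.85, 0.96, 0.51, 1.00, 1.09, 0.99, 1.04, 1.04, 0.96,
]

reactive = count_reactive(type3_ratios)
total = len(type3_ratios)
print(f"{reactive}/{total} = {100 * reactive / total:.0f}%")  # 5/21 = 24%
```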

TABLE A

serum      E1/WT

type 1a
3748       3.15
3807       3.51
5282       1.99
9321       3.12
9324       2.76
9325       6.12
9326       10.56
9356       1.79
9388       3.5
8366       10.72
8380       2.27
10925      4.02
10936      5.04
10938      1.36

type 1b
5205       2.25
5222       1.33
5246       1.24
5250       13.58
5493       0.87
5573       1.75
8243       1.77
8244       2.05
8316       1.21
8358       5.04
9337       14.47
9410       5
9413       5.51
10905      1.26
10919      5.00
10928      8.72
10929      8.26
10931      2.3
10932      4.41
44         2.37
45         3.14
46         4.37
47         5.68
48         2.97
49         1.18
50         9.85
51         4.51
52         1.11
53         5.20
54         0.98
55         1.48
56         1.06
57         3.85
58         7.6
59         3.28
60         3.23
61         7.82
62         1.92

type 2
23         0.91
24         1.16
25         2.51
26         0.96
27         1.20
28         0.96
29         2.58
30         8.05
31         0.92
32         0.82
33         5.75
34         0.79
35         0.86
36         0.85
37         0.76
38         0.92
39         1.08
40         2.33
41         2.83
42         1.21
43         0.91

type 3
1          6.88
2          1.47
3          3.06
4          6.52
5          10.24
6          2.72
7          1.11
8          1.54
9          1.60
10         1.21
11         1.07
12         1.00
13         0.85
14         0.96
15         0.51
16         1.00
17         1.09
18         0.99
19         1.04
20         1.04
21         0.96

type 4
GB48       0.87
GB113      0.49
GB116      0.68
GB215      0.73
GB358      0.52
GB359      0.56
GB438      0.71
GB516      1.08
22         1.04

type 5
BE95       0.36


Core/E1 clones of isolates BR36 (type 3a) and BE95 (type 5a) were subsequently recombined
into the viruses vvHCV-62 and vvHCV-63, respectively. A genotyped panel of sera was
subsequently tested onto cell lysates obtained from RK13 cells infected with the recombinant
viruses vvHCV-62 and vvHCV-63. Tests were carried out as described above and the results
are given in the table given below (TABLE B). From these results, it can clearly be seen that,
although some cross-reactivity occurs (especially between type 1 and 3), the obtained values
of a given serum are usually higher on its homologous E1 protein than on an E1 protein of
another genotype. For type 5 sera, none of the 5 sera were reactive on type 1 or 3 E1
proteins, while 3 out of 5 were shown to contain anti-E1 antibodies when tested on their
homologous type 5 protein. Therefore, in this simple test system, a considerable number of
sera can already be serotyped. Combined with the reactivity to type-specific NS4 epitopes or
epitopes derived from other type-specific parts of the HCV polyprotein, a serotyping assay
may be developed for discriminating the major types of HCV. To overcome the problem of
cross-reactivity, the position of cross-reactive epitopes may be determined by someone skilled
in the art (e.g. by means of competition of the reactivity with synthetic peptides), and the
epitopes evoking cross-reactivity may be left out of the composition to be included in the
serotyping assay or may be included in sample diluent to outcompete cross-reactive
antibodies.
TABLE B

serum        E1/WT        E1/WT        [illegible]

type 1b
8316         0.89         0.59         0.30
8358         2.22         2.65         1.96
9337         1.59         0.96         0.93
9410         16.32        9.60         3.62
9413         9.89         2.91         2.85
10905        1.04         0.96         1.05
10919        3.17         2.56         2.96
10928        4.39         2.28         2.07
10929        2.95         2.07         2.08
10931        3.11         1.49         2.11
5            0.86         0.86         0.96
6            3.48         1.32         1.32
7            6.76         4.00         [illegible]
8            10.88        3.44         4.04
9            1.76         1.88         1.58
10           [illegible]  [illegible]  7.20
11           0.45         [illegible]  [illegible]
[further rows illegible]

type 3
8332         3.39         4.22         0.66
10907        3.24         4.39         0.96
[further rows illegible]
[illegible]  1.21         1.29         1.22
30           0.85         4.11         0.98
32           0.85         2.16         1.04

type 5
[illegible]  0.78         0.95         1.54
BE110        0.79         1.01         4.95
BE95         0.47         0.52         0.65
BE111        0.71         0.75         8.33
BE112        1.01         1.27         2.37
BE113        1.11         1.35         1.60
Table 5. Homologies of new HCV sequences with other known HCV types

Region            Isolate      1a           1b           2a           2b           3a           3b
(nucleotides)     (type)       HCV-1        HCV-J        HC-J6        HC-J8        T1    T7     T9    T10

Core (1-573)      PC (5)       83.8 (91.6)  84.3 (92.1)  82.6 (90.?)  82.4 (89.0)

E1 (574-957)      HD10 (3)     61.? (68.0)  64.6 (68.8)  57.3 (55.5)  56.3 (59.4)
                  BR36 (3)     62.0 (66.4)  62.5 (67.2)  56.5 (53.9)  55.2 (58.6)
                  BR33 (3)     60.? (67.2)  63.3 (68.0)  56.5 (54.7)  56.0 (58.6)
                  PC (5)       61.4 (64.0)  62.4 (64.8)  54.1 (49.6)  53.3 (47.2)
                  GB358 (4a)   62.5 (69.1)  62.8 (65.9)  59.4 (54.0)  54.4 (54.0)
                  GB549 (4b)   66.0 (72.2)  62.8 (69.8)  59.1 (56.4)  56.5 (54.0)
                  GB809 (4c)   63.3 (69.1)  60.7 (64.3)  56.7 (53.2)  53.0 (51.6)

NS3 (3856-4209)   PC (5)       74.? (89.?)  76.1 (86.4)  76.1 (89.8)  78.0 (89.0)

NS4 (4892-5292)   BR36 (3)     67.8 (78.5)  69.8 (?5.1)  62.0 (6?.5)  61.? (66.0)
                  HD10 (3)     69.3 (74.6)  66.6 (69.7)  57.8 (59.9)  59.1 (59.9)

NS4 (4956-5292)   PC (5)       61.3 (62.2)  63.0 (65.5)  52.9 (46.2)  54.3 (43.7)

NS5b (8023-8235)  BR34 (3)     65.?         66.7         63.9         64.3         94.8  93.9   75.?  77.0
                  BR36 (3)     64.3         67.6         64.8         66.?         94.8  93.4   ?5.1  76.5
                  BR33 (3)     65.7         67.1         64.3         64.8         94.8  93.9   76.0  77.5
                  GB358 (4a)   67.7 (?6.1)  65.6 (77.0)  66.5 (70.8)  65.6 (71.7)
                  GB549 (4b)   68.8 (76.1)  67.1 (?7.0)  65.9 (71.7)  65.9 (74.4)
                  GB809 (4c)   68.5 (73.5)  65.0 (73.5)  67.7 (69.9)  67.? (73.5)

Shown are the nucleotide homologies (the amino-acid homology is given between brackets)
for the region indicated in the left column. Digits that are illegible are shown as "?".
Table 6. NS4 sequences of the different genotypes

prototype   TYPE   SYNTHETIC PEPTIDE NS4-1    SYNTHETIC PEPTIDE NS4-5    SYNTHETIC PEPTIDE NS4-7
                                              (NS4b)                     (NS4b)
position ->        1690       1700            1720       1730            1730       1740

HCV-1       1a     LSG KPAIIPDREV LYREFDE     SQHLPYIEQ GMMLAEQFKQ K     LAEQFKQ KALGLLQTAS RQA
HCV-J       1b     LSG RPAVIPDREV LYQEFDE     ASHLPYIEQ GMQLAEQFKQ K     LAEQFKQ KALGLLQTAT KQA
HC-J6       2a     VNQ RAVVAPDKEV LYEAFDE     ASRAALIEE GQRIAEMLKS K     IAEMLKS KIQGLLQQAS KQA
HC-J8       2b     LND RVVVAPDKEI LYEAFDE     ASKAALIEE GQRMAEMLKS K     MAEMLKS KIQGLLQQAT RQA
BR36        3a     LGG KPAIVPDKEV LYQQ YDE    SQAAPYIEQ AQVIAHQFKE K     IAHQFKE KVLGLLQRAT QQQ
PC          5      LSG KPAIIPDREA LYQQ FDE    ?ASLPY??E TRAIAGQFKT K     IAGQFKT K?LGFISTTG QKA
                   V

\ residues conserved in every genotype. Underlined amino acids are type-specific, amino
acids in italics are unique to type 3 and 5 sequences. Residues that are illegible are shown as "?".
Table 7

SEQ ID NO     Primer (polarity)   Sequence from 5' to 3'

63            HCPr161             5'-ACCGGAGGCCAGGAGAGTGATCTCCTCC-3'
[illegible]   HCPr162             5'-GGGCTGCTCTATCCTCATCGACGCCATC-3'
[illegible]   HCPr163             5'-GCCAGAGGCTCGGAAGGCGATCAGCGCT-3'
[illegible]   HCPr164             5'-GAGCTGCTCTGTCCTCCTCGACGCCGCA-3'
67            HCPr23(+)           5'-CTCATGGGGTACATTCCGCT-3'
[illegible]   HCPr54(-)           5'-CTATTACCAGTTCATCATCATATCCCA-3'
69            HCPr116(+)          5'-ttttAAATACATCATGRCITGYATG-3'
70            HCPr66(-)           5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3'
71            HCPr118(-)          5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3'
72            HCPr117(+)          5'-ttttAAATACATCGCIRCITGCATGCA-3'
73            HCPr119(-)          5'-actagtcgactaRTTIGCIATIAGCCKRTTCATCCAYTG-3'
74            HCPr131(+)          5'-ggaattctagaCCITCITGGGAYGARAYITGGAARTG-3'
75            HCPr130(+)          5'-ggaattctagACIGCITAYCARGCIACIGTITGYGC-3'
76            HCPr134(+)          5'-CATATAGATGCCCACTTCCTATC-3'
77            HCPr3(+)            5'-GTGTGCCAGGACCATC-3'
78            HCPr4(-)            5'-GACATGCATGTCATGATGTA-3'
79            HCPr152             5'-TACGCCTCTTCTATATCGGTTGGGGCCTG-3'
80            HCPr52(+)           5'-atgTTGGGTAAGGTCATCGATACCCT-3'
81            HCPr41(+)           5'-CCCGGGAGGTCTCGTAGACCGTGCA-3'
82            HCPr40(-)           5'-ctattaAAGATAGAGAAAGAGCAACCGGG-3'
124           HCPr206             5'-tggggatcccgtatgatacccgctgctttga-3'
125           HCPr207             5'-ggcggaattcctggtcatagcctccgtgaa-3'
141           HCPr109             5'-tgggatatgatgatgaactggtc-3'
142           HCPr14              5'-ccaggtacaaccgaaccaattgcc-3'
II Type 3 NS4

CLAIMS

1. A composition comprising or consisting of at least one polynucleic acid containing 8 or
more contiguous nucleotides selected from at least one of the following HCV sequences:

an HCV type 3 genomic sequence, more particularly in any of the following
regions:

the region spanning positions 417 to 957 of the Core/E1 region of HCV subtype
3a,

the region spanning positions 4664 to 4730 of the NS3 region of HCV type 3,
the region spanning positions 4892 to 5292 of the NS3/4 region of HCV type
3,

the region spanning positions 8023 to 8235 of the NS5 region of HCV subtype
3a,

an HCV subtype 3c genomic sequence,

an HCV subtype 2d genomic sequence,

an HCV type 4 genomic sequence,

the coding region of HCV subtype 5a,
with said nucleotide numbering being with respect to the numbering of HCV nucleic acids as
shown in Table 1, and with said polynucleic acids containing at least one nucleotide difference
with known HCV polynucleic acid sequences in the above-indicated regions, or the
complement thereof.

2. A composition according to claim 1, wherein said polynucleic acids correspond to a
nucleotide sequence selected from any of the following HCV genomic sequences:

an HCV genomic sequence having a homology of at least 67%, preferably more than
69%, most preferably 71% or more to any of the sequences as represented in SEQ ID NO
13, 15, 17, 19, 21, 23, 25 or 27 in the region spanning positions 417 to 957 of the
Core/E1 region;

an HCV genomic sequence having a homology of at least 65%, preferably more than
67%, most preferably 69% or more to any of the sequences as represented in SEQ ID NO
19, 21, 23, 25 or 27 in the region spanning positions 574 to 957 of the E1 region;

an HCV genomic sequence having a homology of at least 79%, more preferably at least
81%, most preferably more than 83% to the sequence as represented in
SEQ ID NO 147 in the region spanning positions 1 to 378 of the Core region;

an HCV genomic sequence having a homology of at least 74%, more preferably at least
76%, most preferably more than 78% to any of the sequences as represented in
SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or 27 in the region spanning positions 417 to 957
in the Core/E1 region;

an HCV genomic sequence having a homology of at least 74%, preferably more than
76%, most preferably 78% or more to any of the sequences as represented in SEQ ID NO
13, 15, 17, 19, 21, 23, 25 or 27 in the region spanning positions 574 to 957 in the E1
region;

an HCV genomic sequence having a homology of more than 73.5%, preferably more than
74%, most preferably 75% homology to the sequence as represented in SEQ ID
NO 29 in the region spanning positions 4664 to 4730 of the NS3 region;

an HCV genomic sequence having a homology of more than 70%, preferably more than
72%, most preferably more than 74% homology to any of the sequences as represented
in SEQ ID NO 29, 31, 33, 35, 37 or 39 in the region spanning positions 4892 to 5292
in the NS3/NS4 region;

an HCV genomic sequence having a homology of more than 95%, preferably 95.5%,
most preferably 96% homology to any of the sequences as represented in SEQ ID NO 5,
7, 1, 3, 9 or 11 in the region spanning positions 8023 to 8235 of the NS5 region;

an HCV genomic sequence of the BR36 subgroup of HCV type 3a having a homology
of more than 96%, preferably 96.5%, most preferably 97% homology to any of the
sequences as represented in SEQ ID NO 5, 7, 1, 3, 9 or 11 in the region spanning
positions 8023 to 8192 of the NS5B region;

an HCV genomic sequence having a homology of more than 79%, more preferably more
than 81%, and most preferably more than 83% to the sequence as represented in SEQ ID
NO 149 in the region spanning positions 7932 to 8271 in the NS5B region.
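
The homology thresholds recited above can be illustrated with a short sketch. It assumes that "homology" means percent identity over pre-aligned sequences within the stated position span; the claims themselves do not fix the computation at this point. The function name `percent_homology` and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_homology(query: str, reference: str, start: int, end: int) -> float:
    """Percent identity over 1-based, inclusive alignment positions start..end.

    Assumption: both sequences are already aligned to the same numbering,
    with '-' marking gaps. A gap never counts as a match.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not 1 <= start <= end <= len(reference):
        raise ValueError("region lies outside the aligned sequences")
    q = query[start - 1:end].upper()
    r = reference[start - 1:end].upper()
    matches = sum(1 for a, b in zip(q, r) if a == b and a != "-")
    return 100.0 * matches / len(q)


# Toy 10-nucleotide alignment with one mismatch (not real HCV data).
toy_query = "ACGTACGTAC"
toy_reference = "ACGTTCGTAC"
homology = percent_homology(toy_query, toy_reference, 1, 10)

# "at least 67%" is the broadest threshold of the first item of claim 2.
meets_minimum = homology >= 67.0
```

Restricting `start` and `end` to a span such as 574 to 957 would confine the comparison to the corresponding region, provided the alignment uses the numbering of Table 1.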

3. A composition according to claim 1, wherein said polynucleic acids correspond to a
nucleotide sequence selected from any of the following HCV genomic sequences:

an HCV genomic sequence having a homology of more than 85%, preferably more than
86%, most preferably more than 87% homology to any of the sequences as represented
in SEQ ID NO 41, 43, 45, 47, 49, 51, 53 or 151 in the region spanning positions 1 to
573 of the Core region;

an HCV genomic sequence having a homology of more than 61%, preferably more than
63%, most preferably more than 65% homology to any of the sequences as represented
in SEQ ID NO 41, 43, 45, 47, 49, 51, 53, 153 or 155 in the region spanning positions
574 to 957 of the E1 region;

an HCV genomic sequence having a homology of more than 76.5%, preferably of more
than 77%, most preferably of more than 78% homology with any of the sequences as
represented in SEQ ID NO 55, 57, 197 or 199 in the region spanning positions 3856 to
4209 of the NS3 region;

an HCV genomic sequence having a homology of more than 68%, preferably of more
than 70%, most preferably of more than 72% homology with the sequence as represented
in SEQ ID NO 157 in the region spanning positions 980 to 1179 of the E1/E2 region;

an HCV genomic sequence having a homology of more than 57%, preferably more than
59%, most preferably more than 61% homology to any of the sequences as represented
in SEQ ID NO 59 or 61 in the region spanning positions 4936 to 5296 of the NS4 region;

an HCV genomic sequence having a homology of more than 93%, preferably more than
93.5%, most preferably more than 94% homology to any of the sequences as represented
in SEQ ID NO 159 or 161 in the region spanning positions 7932 to 8271 of the NS5B
region.



4. A composition according to claim 1, wherein said polynucleic acids correspond to a
nucleotide sequence selected from any of the following HCV genomic sequences:

an HCV genomic sequence having a homology of more than 66%, preferably more than
68%, most preferably more than 70% homology in the E1 region spanning positions 574
to 957 to any of the sequences as represented in SEQ ID NO 118, 120 or 122 in the
region spanning positions 1 to 957 of the Core/E1 region;

an HCV genomic sequence having a homology of more than 71%, preferably more than
72%, most preferably more than 74% homology to any of the sequences as represented
in SEQ ID NO 118, 120 or 122 in the region spanning positions 379 to 957;

an HCV genomic sequence having a homology of more than 85%, preferably more than
86%, most preferably more than 86.5% homology to any of the sequences as represented
in SEQ ID NO 183, 185 or 187 in the region spanning positions 379 to 957 of the E1
region;

an HCV genomic sequence having a homology of more than 81%, preferably more than
83%, most preferably more than 85% homology to the sequence as represented in SEQ
ID NO 189 in the region spanning positions 379 to 957 of the E1 region;

an HCV genomic sequence having a homology of more than 85%, preferably more than
87%, most preferably more than 89% homology to any of the sequences as represented
in SEQ ID NO 167 or 169 in the region spanning positions 379 to 957 of the E1 region;

an HCV genomic sequence having a homology of more than 79%, preferably more than
81%, most preferably more than 83% homology to any of the sequences as represented
in SEQ ID NO 171 or 173 in the region spanning positions 379 to 957 of the E1 region;

an HCV genomic sequence having a homology of more than 84%, preferably more than
86%, most preferably more than 88% homology to the sequence as represented in SEQ
ID NO 175 in the region spanning positions 379 to 957 of the E1 region;

an HCV genomic sequence having a homology of more than 83%, preferably more than
85%, most preferably more than 87% homology to the sequence as represented in SEQ
ID NO 177 in the region spanning positions 379 to 957 of the E1 region;

an HCV genomic sequence having a homology of more than 76%, preferably more than
78%, most preferably more than 80% homology to the sequence as represented in SEQ
ID NO 179 in the region spanning positions 379 to 957 of the E1 region;

an HCV genomic sequence having a homology of more than 84%, preferably more than
86%, most preferably more than 88% homology to the sequence as represented in SEQ
ID NO 181 in the region spanning positions 379 to 957 of the E1 region;

an HCV genomic sequence having a homology of more than 73%, preferably more than
75%, most preferably more than 77% homology to any of the sequences as represented
in SEQ ID NO 106, 108, 110, 112, 114 or 116 in the region spanning positions 7932 to
8271 of the NS5 region;

an HCV genomic sequence having a homology of more than 88%, preferably more than
89%, most preferably more than 90% homology to any of the sequences as represented
in SEQ ID NO 106, 108, 110 or 112 in the region spanning positions 7932 to 8271 of
the NS5 region;

an HCV genomic sequence having a homology of more than 88%, preferably more than
89%, most preferably more than 90% homology to any of the sequences as represented
in SEQ ID NO 116 or 201 in the region spanning positions 7932 to 8271 of the NS5
region;

an HCV genomic sequence having a homology of more than 87%, preferably more than
[…] SEQ ID NO 203 in the region spanning positions 7932 to 8271 of the NS5 region;

an HCV genomic sequence having a homology of more than 85%, preferably more than
87%, most preferably more than 89% homology to the sequence as represented in SEQ
ID NO 114 in the region spanning positions 7932 to 8271 of the NS5 region;

an HCV genomic sequence having a homology of more than 86%, preferably more than
87%, most preferably more than 88% homology to the sequence as represented in SEQ
ID NO 207 in the region spanning positions 7932 to 8271 of the NS5 region;

an HCV genomic sequence having a homology of more than 84%, preferably more than
86%, most preferably more than 88% homology to the sequence as represented in SEQ
ID NO 209 in the region spanning positions 7932 to 8271 of the NS5 region;

an HCV genomic sequence having a homology of more than 81%, preferably more than
83%, most preferably more than 85% homology to the sequence as represented in SEQ
ID NO 211 in the region spanning positions 7932 to 8271 of the NS5 region.

5. A composition according to claim 1, wherein said polynucleic acids correspond to a
nucleotide sequence selected from any of the following HCV genomic sequences:

an HCV genomic sequence having a homology of more than 78%, preferably more than
80%, most preferably more than 82% homology to the sequence as represented in SEQ
ID NO 143 in the region spanning positions 379 to 957 of the Core/E1 region;

an HCV genomic sequence having a homology of more than 74%, preferably more than
76%, most preferably more than 78% homology to the sequence as represented in SEQ
ID NO 143 in the region spanning positions 574 to 957;

an HCV genomic sequence having a homology of more than 87%, preferably more than
89%, most preferably more than 91% homology to the sequence as represented in SEQ
ID NO 145 in the region spanning positions 7932 to 8271 of the NS5B region.

6. A composition according to any of claims 1 to 5, wherein said polynucleic acid is liable
to act as a primer for amplifying the nucleic acid of a certain isolate belonging to the genotype
from which the primer is derived.

7. A composition according to any of claims 1 to 5, wherein said polynucleic acid is able
to act as a hybridization probe for specific detection and/or classification into types of a
nucleic acid containing said nucleotide sequence, with said oligonucleotide being possibly
labelled or attached to a solid substrate.

8. Use of a composition according to any of claims 1 to 7 for in vitro detecting the presence
of one or more HCV genotypes, more particularly for detecting the presence of a nucleic acid
of any of the HCV genotypes having a nucleotide sequence as defined in any of claims 1 to
5, present in a biological sample liable to contain them, comprising at least the following
steps:

(i) possibly extracting sample nucleic acid,

(ii) possibly amplifying the nucleic acid with at least one of the primers according to
claim 6 or any other HCV type 2, HCV type 3, HCV type 4, HCV type 5 or
universal HCV primer,

(iii) hybridizing the nucleic acids of the biological sample, possibly under denatured
conditions, and with said nucleic acids being possibly labelled during or after
amplification, at appropriate conditions with one or more probes according to claim
7, with said probes being preferably attached to a solid substrate,

(iv) washing at appropriate conditions,

(v) detecting the hybrids formed,

(vi) inferring the presence of one or more HCV genotypes present from the observed
hybridization pattern.
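
Step (vi) above can be sketched as a simple lookup from reactive probes to genotypes. This is only an illustration under stated assumptions: the probe names (`probe_A` and so on) and their assignment to types are hypothetical placeholders, not probes defined in the claims, and a real assay would also have to handle cross-reactive or universal probes.

```python
def infer_genotypes(reactive_probes, probe_to_type):
    """Return the sorted set of HCV types whose probes gave a hybridization signal.

    Assumption: every probe is specific for exactly one type or subtype.
    """
    unknown = set(reactive_probes) - set(probe_to_type)
    if unknown:
        raise KeyError(f"unrecognized probes: {sorted(unknown)}")
    return sorted({probe_to_type[p] for p in reactive_probes})


# Hypothetical probe panel; names and assignments are illustrative only.
PROBE_TO_TYPE = {
    "probe_A": "2d",
    "probe_B": "3c",
    "probe_C": "4",
    "probe_D": "5a",
}

# Probes that remained hybridized after washing (steps iii to v).
observed = ["probe_C", "probe_A"]
types = infer_genotypes(observed, PROBE_TO_TYPE)
```

A pattern with more than one reactive type-specific probe would be read here as the presence of more than one genotype in the sample.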

9. A composition consisting of or comprising at least one peptide or polypeptide containing
in its sequence a contiguous sequence of at least 5 amino acids of an HCV polyprotein
encoded by any of the polynucleic acids according to any of claims 1 to 5.
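
The "contiguous sequence of at least 5 amino acids" condition can be checked with a sliding window, as in the following minimal sketch. The function name and the toy polyprotein string are hypothetical; only the peptide RPRRHQTVQT embedded in the toy string is taken from the peptide list of claim 15.

```python
def shares_contiguous_run(peptide: str, polyprotein: str, min_len: int = 5) -> bool:
    """True if the peptide contains min_len or more consecutive residues
    that also occur consecutively in the polyprotein (one-letter codes)."""
    peptide = peptide.upper()
    polyprotein = polyprotein.upper()
    for i in range(len(peptide) - min_len + 1):
        if peptide[i:i + min_len] in polyprotein:
            return True
    return False


# Toy polyprotein: flanking residues are invented for illustration.
toy_polyprotein = "AAAA" + "RPRRHQTVQT" + "CCCC"

hit = shares_contiguous_run("GGRPRRHGG", toy_polyprotein)   # shares RPRRH
miss = shares_contiguous_run("RPRRGQTVQ", toy_polyprotein)  # no shared 5-mer
```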

10. A composition according to claim 9, wherein said contiguous sequence contains in its
sequence at least one of the following amino acid residues:

L7, Q43, M44, S60, R67, Q70, T71, A79, A87, N106, K115, A127, A190, S130, V134,
G142, I144, E152, A157, V158, P165, S177 or Y177, I178, V180 or E180 or F182, R184,
I186, H187, T189, A190, S191 or G191, Q192 or L192 or n92 or V192 or E192, N193 or
H193 or P193, W194 or Y194, H195, A197 or 097 or V197 or T197, V202, I203 or L203,
Q208, A210, V212, F214, T216, R217 or D217 or E217 or V217, H218 or N218, H219 or
V219 or L219, L227 or I227, M231 or E231 or Q231, T232 or D232 or A232 or K232, Q235
or I235, A237 or T237, I242, I246, S247, S248, V249, S250 or Y250, I251 or V251 or M251
or F251, D252, T254 or V254, L255 or V255, E256 or A256, M258 or F258 or V258, A260
or Q260 or S260, A261, I264 or Y264, M265, I266 or A266, A267, G268 or T268, F271 or
M271 or V271, I277, M280 or H280, I284 or A284 or L284, V274, V291, N292 or S292,
R293 or I293 or Y293, Q294 or R294, L297 or I297 or Q297, A299 or K299 or Q299, N303
or T303, T308 or L308, T310 or F310 or .^310 or 03 10 or V310, L313, G317 or Q317,
L333, S351, A358, .^59, A363, S364, A366, T369, L373, F376, Q386, I387, S392, I399,
F402, I403, R405, D454, A461, A463, T464, K484, Q500, E501, S521, K522, H524, N528,
S531, S532, V534, F536, F537, M539, I546, C1282, A1283, H1310, V1312, Q1321, P1368,
V1372, V1373, K1405, Q1406, S1409, A1424, A1429, C1435, S1436, S1456, H1496, A1504,
D1510, D1529, I1543, N1567, D1556, N1567, M1572, Q1579, L1581, S1583, F1585, V1595,
E1606 or T1606, M1611, V1612 or L1612, P1630, C1636, P1651, 11656 or 11656, L1663,
V1667, V1677, A1681, H1685, E1687, G1689, V1695, A1700, Q1704, Y1705, A1713, A1714
or S1714, M1718, D1719, A1721 or T1721, R1722, A1723 or V1723, H1726 or G1726,
E1730, V1732, F1735, I1736, S1737, R1738, T1739, G1740, Q1741, K1742, Q1743, A1744,
T1745, L1746, E1747 or K1747, I1749, A1750, T1751 or A1751, V1753, N1755, K1756,
A1757, P1758, A1759, H1762, T1763, Y1764, P2645, A2647, K2650, K2653 or L2653,
S2664, N2673, F2680, K2681, L2686, F12692, Q2695 or L2695 or I2695, V2712, F2715,
V2719 or Q2719, T2722, T2724, S2725, R2726, G2729, Y2735, H2739, I2748, G2746 or
I2746, I2743, P2752 or K2752, P2754 or T2754, T2757 or P2757,

with said notation being composed of a letter representing the amino acid residue by its one-letter
code, and a number representing the amino acid numbering according to Kato et al.,
1990 as shown in Table 1.
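
The residue notation described above (one-letter code followed by a position, with alternatives joined by "or") lends itself to a small parser. The sketch below is an illustration only: the function names are hypothetical, and the toy sequence is invented so that position 7 carries a leucine, matching the first entry L7.

```python
import re

# One of the 20 standard one-letter amino acid codes followed by a position.
_RESIDUE = re.compile(r"^([ACDEFGHIKLMNPQRSTVWY])(\d+)$")


def parse_residue(token: str):
    """Split a token such as 'Q43' into ('Q', 43)."""
    m = _RESIDUE.match(token.strip())
    if m is None:
        raise ValueError(f"not a residue notation: {token!r}")
    return m.group(1), int(m.group(2))


def parse_alternatives(entry: str):
    """Split an entry such as 'S177 or Y177' into a list of (residue, position)."""
    return [parse_residue(t) for t in entry.split(" or ")]


def contains_residue(polyprotein: str, entry: str) -> bool:
    """True if the polyprotein carries any listed alternative at its 1-based position.

    Assumption: the string is numbered like the polyprotein of Table 1.
    """
    for aa, pos in parse_alternatives(entry):
        if 1 <= pos <= len(polyprotein) and polyprotein[pos - 1] == aa:
            return True
    return False


toy_sequence = "MSTNPKL"  # invented; residue 7 is L
first = parse_residue("Q43")
alternatives = parse_alternatives("S177 or Y177")
has_l7 = contains_residue(toy_sequence, "L7")
```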

11. A composition according to any of claims 9 or 10, wherein said contiguous sequence
is selected from any of the following HCV amino acid sequences:

- a sequence having a homology of more than 72%, preferably more than 74%, and most
preferably more than 77% homology to any of the amino acid sequences as represented in
SEQ ID NO 14, 16, 18, 20, 22, 24, 26 or 28 in the region spanning positions 140 to 319
in the Core/E1 region;

- a sequence having a homology of more than 70%, preferably more than 72%, and most
preferably more than 75% homology to any of the amino acid sequences as represented in
SEQ ID NO 14, 16, 18, 20, 22, 24, 26 or 28 in the E1 region spanning positions 192 to
319;

- a sequence having a homology of more than 86%, preferably more than 88%, and most
preferably more than 90% homology to the amino acid sequence as represented in SEQ
ID NO 148 in the region spanning positions 1 to 110 in the Core region;

- a sequence having a homology of more than 76%, preferably more than 78%, most
preferably more than 80% to any of the amino acid sequences as represented in SEQ ID
NO 30, 32, 34, 36, 38 or 40 in the region spanning positions 1646 to 1764 in the NS3/NS4
region;

- a sequence having a homology of more than 81.5%, preferably more than 83%, and most
preferably more than 86% homology to any of the amino acid sequences as represented in
SEQ ID NO 14, 16, 18, 20, 22, 24, 26 or 28 in the E1 region spanning positions 192 to
319;

- a sequence having a homology of more than 86%, preferably more than 88%, most
preferably more than 90% to the amino acid sequence as represented in SEQ ID NO 150
in the region spanning positions 2645 to 2757 in the NS5B region.
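
Several of the amino acid spans above line up with nucleotide spans recited in claim 2. The sketch below shows the arithmetic, assuming that nucleotide position 1 is the first base of codon 1 of the polyprotein (the numbering itself is defined in Table 1, so this is an assumption made for illustration). The function name `complete_codons` is hypothetical.

```python
def complete_codons(nt_start: int, nt_end: int):
    """Return (first, last) codon numbers fully contained in nt_start..nt_end.

    Codon k covers nucleotides 3k-2 .. 3k (1-based), under the assumption
    that nucleotide 1 is the first base of codon 1.
    """
    first = (nt_start + 4) // 3  # smallest k with 3k - 2 >= nt_start
    last = nt_end // 3           # largest k with 3k <= nt_end
    if first > last:
        raise ValueError("span contains no complete codon")
    return first, last


e1_span = complete_codons(574, 957)        # E1 region
core_e1_span = complete_codons(417, 957)   # Core/E1 region
ns5b_span = complete_codons(7932, 8271)    # NS5B region
```

Under this assumption the three nucleotide spans map to amino acid positions 192 to 319, 140 to 319 and 2645 to 2757, which are the spans used in this claim. Not every nucleotide and amino acid span pairing in the claims follows this arithmetic exactly.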

12. A composition according to any of claims 9 or 10, wherein said contiguous sequence
is selected from any of the following HCV amino acid sequences:

- a sequence having a homology of more than 80%, preferably more than 82%, most
preferably more than 84% homology to any of the amino acid sequences as represented in
SEQ ID NO 118, 120, and 122 in the region spanning positions 127 to 319;

- a sequence having a homology of more than 73%, preferably more than 75%, most
preferably more than 78% homology in the E1 region spanning positions 192 to 319 to any
of the amino acid sequences as represented in SEQ ID NO 118, 120, and 122, in the region
spanning positions 127 to 319;

- a sequence having more than 85%, preferably more than 86%, most preferably more than
87% homology to any of the amino acid sequences as represented in SEQ ID NO 118, 120
or 122, in the region spanning positions 192 to 319.

13. A composition according to any of claims 9 or 10, wherein said contiguous sequence
is selected from any of the following HCV amino acid sequences:

- a sequence having more than 93%, preferably more than 94%, most preferably more than
95% homology in the region spanning Core positions 1 to 191 to any of the amino acid
sequences as represented in SEQ ID NO 42, 44, 46, 48, 50, 52, 54, or 152;

- a sequence having more than 73%, preferably more than 74%, most preferably more than
76% homology in the region spanning E1 positions 192 to 319 to any of the amino acid
sequences as represented in SEQ ID NO 42, 44, 46, 48, 50, 52, 54, 154 or 156;

- a sequence spanning positions 1286 to 1403 of the NS3 region, with said sequence being
characterized as having more than 90%, preferably more than 91%, most preferably more
than 92% homology to any of the amino acid sequences represented in SEQ ID NO 56 to
58;

- a sequence spanning positions 1646 to 1764 of the NS3/4 region, with said sequence being
characterized as having more than 66%, more particularly 68%, most particularly 70% or
more homology to any of the amino acid sequences as represented in SEQ ID NO 60 or
62.



14. A composition according to any of claims 9 to 10, wherein said contiguous sequence is
selected from any of the following HCV amino acid sequences:

- a sequence having more than 83%, preferably more than 85%, most preferably more
than 87% homology in the region spanning Core positions 1 to 319 to the amino acid
sequence as represented in SEQ ID NO 144;

- a sequence having more than 79%, preferably more than 81%, most preferably more
than 84% homology in the region spanning E1 positions 192 to 319 to the amino acid
sequence as represented in SEQ ID NO 144;

- a sequence having more than 95%, more particularly 96%, most particularly 97% or more
homology to the amino acid sequence as represented in SEQ ID NO 146, in the region
spanning positions 2645 to 2757 of the NS5B region.

15. A composition according to any of claims 9 to 14, wherein said sequence is selected from
the following peptides:

QPTGRSWGQ (SEQ ID NO 93)
RSEGRTSWAQ (SEQ ID NO 220)
RTEGRTSWAQ (SEQ ID NO 221)
SRRQPIPRARRTEGRSWAQ (SEQ ID NO 268)
LEWRNTSGL\'\T (SEQ ID NO 83)
VNYRNASGIYHI (SEQ ID NO 126)
QHYRNISGIYHV (SEQ ID NO 127)
EHYRNASGIYHI (SEQ ID NO 128)
IHYRNASGIYHI (SEQ ID NO 224)
V'PYRNASGri'H\' (SEQ ID NO 84)
VNYRNASGI^in (SEQ ID NO 225)
VN^-RNASGVYHI (SEQ ID NO 226)
Vm-HNTSGIYHL (SEQ ID NO 227)
QHYRNASGPrTTV (SEQ ID NO 228)
QITt-RNVSGIYHV (SEQ ID NO 229)
nrtTlNASDGYYl (SEQ ID NO 230)
LQVKNTSSSYM^' (SEQ ID NO 231)
VYEADDVILHT (SEQ ID NO 85)
VTETEHHILHL (SEQ ID NO 129)
VYEADHHIMHL (SEQ ID NO 130)
VYETDHHTLHL (SEQ ID NO 131)
V^-E.-^DNXELa-i, (SEQ ID NO 86)
VWQLILArVLHV (SEQ ID NO 232)
V\-E.ADYHILHL (SEQ ID NO 233)
VTETDN'HELHL (SEQ ID NO 234)
VYETENinLHL (SEQ ID NO 235)
VTETVHHILHL (SEQ ID NO 236)
VFETEHHILHL (SEQ ID NO 237)
VFETDHHEMHL (SEQ ID NO 238)
VYETENHELHL (SEQ ID NO 239)
VYE.\D.\LELR-\ (SEQ ID NO 240)
VQDGNTSTCWTPV (SEQ ID NO 87)
VQDGNTSACWTPV (SEQ ID NO 241)
VRVGNQSRCWVAL (SEQ ID NO 132)
VRTGNTSRCVVWL (SEQ ID NO 133)
VRAGNVSRCWTPV (SEQ ID NO 134)
EEKGNISRCWIPV (SEQ ID NO 242)
VKTGNQSRCWVAL (SEQ ID NO 243)
VRTGNQSRCWVAL (SEQ ID NO 244)
VKTGNQSRCWI.-VL (SEQ ID NO 245)
VKTGNVSRCWIPL (SEQ ID NO 247)
VKTGNVSRCWTSL (SEQ ID NO 248)
V'RKDWSRCW'v-QI (SEQ ID NO 249)
\'R\'VGA7TAS (SEQ ID NO 89)
APYTGAPLES (SEQ ID NO 135)
APYVGAPLES (SEQ ID NO 136)
AVSMD.^LES (SEQ ID NO 137)
APSLGAVTAP (SEQ ID NO 90)
APSFGAVTAP (SEQ ID NO 250)
VSQPGALTKG (SEQ ID NO 251)
\TCY\-GATTAS (SEQ ID NO 252)
APYIGAPVES (SEQ ID NO 253)
AQHLNAPLES (SEQ ID NO 254)
SPYVGAPLEP (SEQ ID NO 255)
SPYAGAPLEP (SEQ ID NO 256)
APYLGAPLEP (SEQ ID NO 257)
APYLGAPLES (SEQ ID NO 258)
AP"WGiJPLES (SEQ ID NO 259)
VP\-LG.\PLTS (SEQ ID NO 260)
APHLRAPLSS (SEQ ID NO 261)
APYLGAPLTS (SEQ ID NO 262)
RPRRHQTVQT (SEQ ID NO 91)
QPRRHWTTQD (SEQ ID NO 138)
RPRRHAVTTQD (SEQ ID NO 139)
RPRQHATVQN (SEQ ID NO 92)
RPRQHATVQD (SEQ ID NO 263)
SPQHHKFVQD (SEQ ID NO 264)
RPRRLWTTQE (SEQ ID NO 265)
PPRIHETTQD (SEQ ID NO 266)
TISYANGSGPSDDK (SEQ ID NO 267)

16. Recombinant vector, particularly for cloning and/or expression, with said recombinant
vector comprising a vector sequence, an appropriate prokaryotic, eukaryotic or viral promoter
sequence followed by the nucleotide sequences as defined in claims 1 to 5, with said
recombinant vector allowing the expression of any one of the HCV type 2 and/or HCV type
3 and/or type 4 and/or type 5 derived polypeptides according to any of claims 9 to 15 in a
prokaryotic or eukaryotic host, or in living mammals when injected as naked DNA, and more
particularly a recombinant vector allowing the expression of any of the following HCV type
2, HCV type 3, type 4 or type 5 polypeptides spanning the following amino acid positions:

- a polypeptide starting at position 1 and ending at any position in the region between
positions 70 and 326, more particularly a polypeptide spanning positions 1 to 70, 1 to 85,
positions 1 to 120, positions 1 to 150, positions 1 to 191, positions 1 to 200, for expression
of the Core protein, and positions 1 to 263, positions 1 to 326, for expression of the Core
and E1 protein;

- a polypeptide starting at any position in the region between positions 117 and 192, and
ending at any position in the region between positions 263 and 326, more particularly from
positions 119 to 326, for expression of E1, or forms that have the putative membrane
anchor deleted (positions 264 to 293 plus or minus 3 amino acids);

- a polypeptide starting at any position in the region between positions 1556 and 1638, and
ending at any position in the region between positions 1739 and 1764, for expression of
the NS4 regions, more particularly a polypeptide starting at position 1658 and ending at
position 1711 for expression of the NS4a antigen, and more particularly, a polypeptide
starting at position 1712 and ending between positions 1743 and 1972, for example 1712-
1743, 1712-1764, 1712-1782, 1712-1972, 1712 to 1782 and 1902 to 1972 for expression
of the NS4b protein or parts thereof.

17. A composition according to any of claims 9 to 15, wherein said polypeptide is a
recombinant polypeptide expressed by means of an expression vector as defined in claim 16.

18. A composition according to any of claims 9 to 15 or 16, for use in a method for
immunizing a mammal, preferably humans, against HCV comprising administering a
sufficient amount of the composition possibly accompanied by pharmaceutically acceptable
adjuvants, to produce an immune response, more particularly a vaccine composition including
HCV type 3 polypeptides derived from the E1, Core, or NS4 region and/or type 4 and/or type
5 and/or type 2 polypeptides.

19. Antibody raised upon immunization with a composition according to any of claims 9 to
15, 17 or 18, by means of a process according to claim 18, with said antibody being reactive
with any of the polypeptides as defined in any of claims 9 to 15, 17 or 18.

20. Process for detecting in vitro HCV present in a biological sample liable to contain it,
comprising at least the following steps:

(i) contacting the biological sample to be analyzed for the presence of HCV antibodies
with any of the compositions according to claims 9 to 15, 17 or 18, preferentially
in an immobilized form under appropriate conditions which allow the formation
of an immune complex, wherein said polypeptide is preferentially in the form of
a biotinylated polypeptide and is covalently bound to a solid substrate by means
of streptavidin or avidin complexes,

(ii) removing unbound components,

(iii) incubating the immune complexes formed with heterologous antibodies, which
specifically bind to the antibodies present in the sample to be analyzed, with said
heterologous antibodies being conjugated to a detectable label under appropriate
conditions,

(iv) detecting the presence of said immune complexes visually or by means of
densitometry and inferring the HCV serotype(s) present from the observed
hybridization pattern.

21. Use of a composition according to any of claims 9 to 15, 17 or 18, for incorporation into
a serotyping assay for detecting one or more serological types of HCV present in a biological
sample liable to contain it, more particularly for detecting E1 and NS4 antigens or antibodies
of the different types to be detected combined in one assay format, comprising at least the
following steps:

(i) contacting the biological sample to be analyzed for the presence of HCV antibodies
or antigens of one or more serological types, with at least one of the compositions
according to claims 9 to 15, 17 or 18 in an immobilized form under appropriate
conditions which allow the formation of an immune complex (wherein said
polypeptide is preferentially in the form of a biotinylated polypeptide and is
covalently bound to a solid substrate by means of streptavidin or avidin
complexes),

(ii) removing unbound components,

(iii) incubating the immune complexes formed with heterologous antibodies, which
specifically bind to the antibodies present in the sample to be analyzed, with said
heterologous antibodies being conjugated to a detectable label under appropriate
conditions,

(iv) detecting the presence of said immune complexes visually or by means of
densitometry and inferring the HCV serological types present from the observed
binding pattern.

22. A kit for determining the presence of HCV genotypes as defined in any of claims 1 to 5
present in a biological sample liable to contain them, comprising:

- possibly at least one primer composition containing any primer selected from those
defined in claim 6 or any other HCV type 2 and/or HCV type 3 and/or HCV type 4
and/or HCV type 5, or universal HCV primers,

- at least one probe composition according to claim 7, preferably in combination with
other polypeptides or peptides from HCV type 1, type 2 or other types of HCV, with
said probes being preferentially immobilized on a solid substrate, and more
preferentially on one and the same membrane strip,

- a buffer or components necessary for producing the buffer enabling hybridization
reaction between these probes and the possibly amplified products to be carried out,

- a means for detecting the hybrids resulting from the preceding hybridization,

- possibly also including an automated scanning and interpretation device for inferring
the HCV genotype(s) present in the sample from the observed hybridization pattern.

23. A kit for determining the presence of HCV antibodies according to any of claims 9 to 15,
17 or 18 present in a biological sample liable to contain them, comprising:

- at least one polypeptide composition according to any of claims 9 to 15, 17 or 18,
with said polypeptides being preferentially immobilized on a solid substrate, and more
preferentially on one and the same membrane strip,

- a buffer or components necessary for producing the buffer enabling binding reaction
between these polypeptides and the antibodies against HCV present in the biological
sample,

- a means for detecting the immune complexes formed in the preceding binding
reaction,

- possibly also including an automated scanning and interpretation device for inferring
the HCV genotype present in the sample from the observed binding pattern.
RULE 63 (37 C.F.R. 1.63)
DECLARATION AND POWER OF ATTORNEY
FOR PATENT APPLICATION
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE






As a below named inventor, I hereby declare that my residence, post office address and citizenship are as stated below next to my name, and I
believe I am the original, first and sole inventor (if only one name is listed below) or an original, first and joint inventor (if plural names are listed
below) of the subject matter which is claimed and for which a patent is sought on the invention entitled: "NEW SEQUENCES OF HEPATITIS C VIRUS
GENOTYPES AND THEIR USE AS THERAPEUTIC AND DIAGNOSTIC AGENTS"

the specification of which (check applicable box(es)):
[ ] is attached hereto
[ ] was filed on ______ as U.S. Application Serial No. ______ (Atty Dkt. No. ______)
[X] was filed as PCT International Application No. PCT/EP94/01323 on 27 APRIL 1994
and (if applicable to U.S. or PCT application) was amended on ______

I hereby state that I have reviewed and understand the contents of the above identified specification, including the claims, as amended by any
amendment referred to above. I acknowledge the duty to disclose information which is material to the patentability of this application in accordance
with 37 C.F.R. 1.56. I hereby claim foreign priority benefits under 35 U.S.C. 119/365 of any foreign application(s) for patent or inventor's certificate
listed below and have also identified below any foreign application for patent or inventor's certificate having a filing date before that of the application
on which priority is claimed or, if no priority is claimed, before the filing date of this application.
Prior Foreign Application(s):

Application Number    Country    Day/Month/Year Filed

93.401099.2           EUROPE     27 APRIL 1993

93.402.019.9          EUROPE     05 AUGUST 1993



I hereby claim the benefit under 35 U.S.C. 120/365 of all prior United States and PCT international applications listed above or below and, insofar as
the subject matter of each of the claims of this application is not disclosed in such prior applications in the manner provided by the first paragraph of
35 U.S.C. 112, I acknowledge the duty to disclose material information as defined in 37 C.F.R. 1.56 which occurred between the filing date of the
prior applications and the national or PCT international filing date of this application.

Prior U.S./PCT Application(s):

Application Serial No.    Day/Month/Year Filed    Status: patented, pending, abandoned

PCT/EP94/01323            27 APRIL 1994           PENDING



I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and belief are believed
to be true, and further that these statements were made with the knowledge that willful false statements and the like so made are punishable by fine
or imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that such willful false statements may jeopardize the validity
of the application or any patent issued thereon. And I hereby appoint NIXON & VANDERHYE P.C., 1100 North Glebe Rd., 8th Floor, Arlington,
VA 22201-4714, telephone number (703) 816-4000 (to whom all communications are to be directed), and the following attorneys thereof (of the
same address) individually and collectively my attorneys to prosecute this application and to transact all business in the Patent and Trademark Office
connected therewith and wrth 'he resulting oaten:: Amur Zrawrors. ;53Z7- jrrv 3. Naon. 25540: Rooe/t A. Vancemve. Z7075. -ames 
Hosmer. 3013^' PvOoert W, Fans, 31352, =^icharc Z, 3esna. II77C: Mane £. Nusoaum. 323^6: .Micnael J. Keenan, 321C6; 3ryan H Davidson. 3C251. 
Stanley C. Sooonsr 17393: Leonard C Mitcharc. 330Q3: ^l^uane ,M 3vers. 332S2: =aul - .-enon. 336t5, Jeffr/ :h. .Meison. 30^61, Jonn R Lastova, 
331 -i9: H. ^A^rran 3umam. Jr. 33365. Thomas H. 3yme. 322CS. Man^ j. 'vUison. 32955. - 3cctt Oavicson. 33^59. 



Inventor's Signature: ____________________    Date: ____________
Inventor: Geert (first) MAERTENS (last)    Citizenship: BELGIAN
Residence (city), (state/country) / Post Office Address (Zip Code): B-8310 BRUGGE

Inventor's Signature: ____________________    Date: ____________
Inventor: (first) (last)    Citizenship: BELGIAN
Residence (city), (state/country): BELGIUM
Post Office Address (Zip Code): B-9340 LEDE, BELGIUM

Inventor's Signature: ____________________    Date: ____________
Inventor: (first) (last)    Citizenship:
Residence (city), (state/country):
Post Office Address (Zip Code):
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: Innogenetics sa. n.v.
(B) STREET: Industriepark Zwijnaarde 7, box 4
(C) CITY: Ghent
(E) COUNTRY: Belgium
(F) POSTAL CODE (ZIP): B-9052
(G) TELEPHONE: 00 32 9 241 07 11
(H) TELEFAX: 00 32 9 241 07 99

(ii) TITLE OF INVENTION: New sequences of hepatitis C virus genotypes
for diagnosis, prophylaxis and therapy.

(iii) NUMBER OF SEQUENCES: 270

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR34-4-20

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..213

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

r-r car CC-^ PTG TTC AAC AGO AAG GGG 
nr*f^ rr^i. C^-G TAC iG^ GvjU x j. w ^-^^ 

Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15



r-- CGC TGC CGT GCC AGT GGA GTT CTG CCT ACC 
GCC CAG TGT GG. C.. C.. T.. CC ^^^^ 

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr



48 
20 25 30

AGC TTC GGC AAC ACA ATC ACT TGC TAC ATC AAG GCC ACA GCG GCT GCA 144
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

AGG GCC GCA GGC CTC CGG AAC CCG GAC TTT CTT GTC TGC GGA GAT GAT 192
Arg Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

CTG GTC GTG GTG GCT GAG AGT 213
Leu Val Val Val Ala Glu Ser
65 70



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

Arg Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-23-19

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..213

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CTC ACG GAA CGG CTT TAC TGC GGG GGC CCT ATG TTC AAC AGC AAG GGG 48
Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

GCC CAG TGT GGT TAT CGC CGC TGC CGT GCC AGT GGA GTT CTG CCT ACC 96
Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

AGC TTC GGC AAC ACA ATC ACT TGC TAC ATC AAG GCC ACA GCG GCT GCA 144
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

AGG GCC GCA GGC CTC CGG AAC CCG GAC TTT CTT GTC TGC GGA GAT GAT 192
Arg Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

CTG GTC GTG GTG GCT GAG AGT 213
Leu Val Val Val Ala Glu Ser
65 70
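A nucleotide entry of this kind can be cross-checked against its protein entry by translating the CDS feature (LOCATION: 1..213) codon by codon. The sketch below is illustrative only and is not part of the filed listing. It assumes the standard genetic code, and the names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical helpers introduced here.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
# (bases in the order T, C, A, G). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}


def translate(cdna, start=1):
    """Translate a cDNA string from 1-based position `start`, ignoring whitespace."""
    seq = "".join(cdna.split()).upper()[start - 1:]
    usable = len(seq) - len(seq) % 3  # drop an incomplete trailing codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides of SEQ ID NO: 3 (clone BR36-23-19), CDS at 1..213.
SEQ_ID_3 = """
CTC ACG GAA CGG CTT TAC TGC GGG GGC CCT ATG TTC AAC AGC AAG GGG
GCC CAG TGT GGT TAT CGC CGC TGC CGT GCC AGT GGA GTT CTG CCT ACC
AGC TTC GGC AAC ACA ATC ACT TGC TAC ATC AAG GCC ACA GCG GCT GCA
AGG GCC GCA GGC CTC CGG AAC CCG GAC TTT CTT GTC TGC GGA GAT GAT
CTG GTC GTG GTG GCT GAG AGT
"""

protein = translate(SEQ_ID_3, start=1)
print(len(protein))
print(" ".join(THREE_LETTER[aa] for aa in protein))
```

For this entry the translation has 71 residues, which matches the LENGTH given for the corresponding protein entry, SEQ ID NO: 4.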



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

Arg Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-23-18

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..213

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CTC ACG GAG CGG CTT TAC TGC GGG GGC CCT ATG TTT AAC AGC AAG GGG 48
Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

GCC CAG TGT GGT TAT CGC CGT TGI CGT GCC AGT GGA GTT CTG CCT ACC 96
Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

AGC TTC GGC AAC ACA ATC ACT TGT TAC ATI AAA GCC AO. GCG GCC GCA 144
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

AAA GCC GCA GGC CTC CGG AGC CCG GAC TTT CTT GTC TGC GGA GAT GAT 192
Lys Ala Ala Gly Leu Arg Ser Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

CTG GTC GTG GTG GCT GAG AGT 213
Leu Val Val Val Ala Glu Ser
65 70



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

Lys Ala Ala Gly Leu Arg Ser Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

Leu Val Val Val Ala Glu Ser
65 70
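Clone-to-clone variation in these short peptides can be located by comparing two protein entries position by position. The following sketch is an illustration added for the reader, not part of the filed listing. It compares the residues of SEQ ID NO: 4 and SEQ ID NO: 6, written with the standard three-letter abbreviations, and the helper names `parse_residues` and `differences` are hypothetical.

```python
def parse_residues(listing):
    """Split a three-letter listing into residues, skipping any numbering rows."""
    return [tok for tok in listing.split() if tok.isalpha()]


def differences(a, b):
    """Return (position, residue_a, residue_b) for each mismatch, 1-based."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


SEQ_ID_4 = """
Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
Arg Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
Leu Val Val Val Ala Glu Ser
"""

SEQ_ID_6 = """
Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
Lys Ala Ala Gly Leu Arg Ser Pro Asp Phe Leu Val Cys Gly Asp Asp
Leu Val Val Val Ala Glu Ser
"""

res4 = parse_residues(SEQ_ID_4)
res6 = parse_residues(SEQ_ID_6)
diffs = differences(res4, res6)
identity = 100.0 * (len(res4) - len(diffs)) / len(res4)

for pos, a, b in diffs:
    print(f"position {pos}: {a} -> {b}")
print(f"identity: {identity:.1f}%")
```

With these two entries the comparison reports two substitutions, Arg to Lys at position 49 and Asn to Ser at position 55, out of 71 residues.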

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-23-20

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..213

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CTC ACG GAG CGG CTT TAC TGC GGG GGC CCT ATG TTT AAC AGC AAA GGG 48
Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

GCC CAG TGT GGT TAT CGC CGT TGC CGT GCC AGT GGA GTT CTG CCT ACC 96
Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

AGC TTC GGC AAC ACA ATC ACT TGT TAC ATC AAA GCC ACA GCG GCC GCJK 144
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

AAA GCC GCA GGC CTC CGG AGC CCG GAC TTT CTT GTC TGC GGA GAT GAT 192
Lys Ala Ala Gly Leu Arg Ser Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

CTG GTC GTG GTG GCT GAG AGT 213
Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

Lys Ala Ala Gly Leu Arg Ser Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR33-2-17

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..213

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CTC ACG GAG CGG CTT TAC TGC GGG GGC CCT ATG TTC AAC AGC AAG GGG 48
Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

GCC CAG TGT GGT TAT CGC CGT TGT CGT GCC AGT GGA GTT CTG CCT ACC 96
Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

AGT TTC GGC AAC ACA ATC ACT TGT TAC ATC AAG GCC ACA GCG GCT GCA 144
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

AAA GCC GCA GGC CTC CGG AAC CCG GAC TTT CTT GTT TGC GGA GAT GAT 192
Lys Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

TTG GTC GTG GTG GCT GAG AGT 213
Leu Val Val Val Ala Glu Ser
65 70



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

Lys Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR33-2-21

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..213

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

CTC ACG GAG CGG CTT TAC TGC GGG GGC CCT ATG TTC AAC AGC AAG GGG 48
Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

GCC CAG TGT GGT TAT CGC CGT TGT CGT GCC AGT GGA GTT CTG CCT ACC 96
Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

AGT TTC GGC AAC ACA ATC ACT TGT TAC ATC AAG GCC ACA GCG GCT GCA 144
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

AAA GCC GCA GGC CTC CGG AAC CCG GAC TTT CTT GTT TGC GGA GAT GAT 192
Lys Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

TTG GTC GTG GTG GCT GAG AGT 213
Leu Val Val Val Ala Glu Ser
65 70



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

Lys Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: HD10-2-5

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

C GTC GGC GCT CCT GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCC TTT TCT ATC TTC CTT CTT GCT CTG TTC TCT TGC [...] 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT AGT CTA GAG TGG CGG AAC ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT CAG GAC GG- AAi 286
Val Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACA TCT GCG TGC TGG ACC CCA GTG ACA CCT ACA GTG GCA GTC AGG TAC 334
Thr Ser Ala Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr
100 105 110

GTC GGA GCA ACC ACC GCT TCG ATA CGC AGG CAT GTA GAC ATG TTG GTG 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Arg His Val Asp Met Leu Val
115 120 125

GGC GCG GCC ACG ATG TGC TCT GCT CTC TAC GTG GGT GAT ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT CGT CGC CAT 478
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

CAA ACG GTC CAG ACC TGT AAC TGC TCA CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Ala Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Arg His Val Asp Met Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180
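The 541 base pair entries differ from the 213 base pair ones in that the CDS feature starts at position 2 (LOCATION: 2..541), so the first base lies outside the reading frame. A short worked check, offered as an illustrative sketch rather than part of the filed listing, shows that the span 2..541 covers 540 bases, or 180 codons, which agrees with the stated length of 180 amino acids. The function name `cds_codon_count` is a hypothetical helper.

```python
def cds_codon_count(location):
    """Number of complete codons in a 'start..end' CDS location (1-based, inclusive)."""
    # Tolerate stray spaces such as "2 . . 541" that scanning can introduce.
    start, end = (int(part) for part in location.replace(" ", "").split(".."))
    span = end - start + 1
    if span % 3:
        raise ValueError(f"CDS span {span} is not a multiple of 3")
    return span // 3


print(cds_codon_count("2..541"))  # 541 bp entries, frame starts at base 2
print(cds_codon_count("1..213"))  # 213 bp entries, frame starts at base 1
```

The same check applied to the shorter entries gives 71 codons for LOCATION: 1..213.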

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: HD10-2-14

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

C GTC GGC GCT CCT GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAA [...] GGG ATA AAT TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCC TTT TCT ATC TTC CTT CCT GCT CTG TTC TCT TGC TTA ATC 142
Gly Cys Ser Phe Ser Ile Phe Leu Pro Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT AGT CTA GAG TGG CGG AAC ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT CAG GAC GGT AAT 286
Val Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACA TCT GCG TGC TGG ACC CCA GTG ACA CCT ACA GTG GCA GTC AGG TAC 334
Thr Ser Ala Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr
100 105 110

GTC GGA GCA ACC ACC GCT TCG ATA CGC AGG CAT GTA GAC ATA TTG GTG 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Arg His Val Asp Ile Leu Val
115 120 125

GGC GCG GCC ACA ATG TGC TCT GCT CTC TAC GTG GGT GAT ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT CGT CGC CAT 478
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

CAA ACG GTC CAG ACC TGT AAC TGC TCA CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175

GGA CAC CGA ATG GCT 541
Gly His Arg Met Ala
180



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Pro Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Ala Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Arg His Val Asp Ile Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: HD10-2-21

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

C GTC GGC GCT CCT GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCC TTT TCT ATC TTC CTT CTT GCT CTG TTC TCT TGC TTA ATC 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT AGT CTA GAG TGG CGG AAC ACG TCT GGC CTC TAG GTC 190
His Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT CAG GAC GGT AAT 286
Val Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACA TCT GCG TGC TGG ACC CCA GTG ACA CCT ACA GTG GC^ GTC AGG TAC 334
Thr Ser Ala Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr
100 105 110

GTC GGA GCA ACC ACC GCT TCG ATA CGC AGG CAT GTA GAC ATA TTG GTG 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Arg His Val Asp Ile Leu Val
115 120 125

GGC GCG GCC ACG ATG TGC TCT GCT CTC TAC GTG GGT GAT ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT CGT CGC CAT 478
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

CAA ACG GTC CAG ACC TGT AAC TGC TCA CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175

GGA CAC CGA ATG GCC 541
Gly His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Ala Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Arg His Val Asp Ile Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180
(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-9-13

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

C GTC GGC GCT CCC GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCC TTT TCT ATT TTC CTT CTT GCT CTG TTC TCT TGC TTA ATT 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT AGT CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGC AGT ATT GTG TAC GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC ACA CCC GGC TGC ATA CCT TGT GTC CAG GAC GGC AAT 286
Val Ile Leu His Thr Pro Gly Cys Ile Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACA TCC ACG TGC TGG ACC CCA GTG ACA CCT ACA GTG GCA GTC AAG TAC 334
Thr Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Lys Tyr
100 105 110

GTC GGA GCA ACC ACC GCT TCG ATA CGC AGT CAT GTG GAC CTA TTA GTG 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
115 120 125

GGC GCG GCC ACG ATG TGC TCA GCG CTC TAC GTG GGT GAT ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

GCC GTC TTC CTT GTG GGA CAA GCC TTC ACG TTC AGA CCT CGT CGC CAT 478
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

CAA ACG GTG CAG ACG TGT AAC TGC TCG CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175

GGA CAT CGA ATG GCT 541
Gly His Arg Met Ala
180



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Thr Pro Gly Cys Ile Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Lys Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-9-20

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

C GTC GGC GCT CCC GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCC TTT TCT ATT TTC CTT CTT GCT CTG TTC TCT TGC TTA ATT 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT AGT CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGC AGT ATT GTG TAC GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC ACA CCC GGC TGC ATA CCT TGT GTC CAG GAC GGC AAT 286
Val Ile Leu His Thr Pro Gly Cys Ile Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACA TCC ACG TGC TGG ACC CCA GTG ACA CCT ACA GTG GCA GTC AAG TAC 334
Thr Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Lys Tyr
100 105 110

GTC GGA GCA ACC ACC GCT TCZ ATA CGC AGT CAT GTG GAC CTA TTA GTG 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
115 120 125

GGC GCG GCC ACG ATG TGC TCT GCG CTC TAC GTG GGT GAC ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

GCT GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT CGT CGC CAT 478
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

CAA ACG GTC CAG ACC TGT AAC TGC TC3 CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175

GGA CAT CGA ATG GCT 541
Gly His Arg Met Ala
180



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Thr Pro Gly Cys Ile Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Lys Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140



SUBSTITUTE SHEET (RULE 26) 



WO 94/25601 PCT/EP94/01323

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR33-1-10

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

C GTC GGC GCT CCC GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAG GAC GGG ATA AAC TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCC TTT TCT ATC TTC CTT CTT GCT CTG TTC TCT TGC TTA ATC 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT GGT CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Gly Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGT AGT ATT GTG TAT GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC GCG CCC GGC TGT GTA CCT TGT GTC CAG GAC GGC AAT 286
Val Ile Leu His Ala Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACG TCT ACA TGC TGG ACC CCA GTA ACA CCT ACA GTG GCA GTC AGG TAC 334
Thr Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr
100 105 110

GTC GGG GCA ACC ACC GCT TCG ATA CGC AGT CAT GTG GAC CTG TTA GTA 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
115 120 125

GGC GCG GCC ACG ATG TGC TCT GCG CTT TAC GTG GGT GAT ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCC CGC CGC CAT 478
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175

GGA CAT CGC ATG GCT 541
Gly His Arg Met Ala
180



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Gly Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Ala Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR33-1-19

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

C GTC GGC GCT CCC GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAG GAC GGG ATA AAC TTC GCA AC^ GC^ AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCT TTT TCT ATC TTC CTT CTT GCT CTG TTC TCT TGC TTA ATC 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT GGT CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Gly Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGT AGT ATT GTG TAT GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC GCG CCC GGC TGT GTA CCT TGT GTC CAG GAC GGC AAT 286
Val Ile Leu His Ala Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACG TCT ACA TGC TGG AC" CCA GTA ACA CCT ACA GTG GCA GTC AGG TAC 334
Thr Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr
100 105 110

GTC GGG GCA ACC ACC GCT TCG ATA CGC AGT CAT GTG GAC CTG TTA GTA 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
115 120 125

GGC GCG GCC ACG ATG TGC TCT GCG CTT TAC GTG GGT GAT ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCC CGC CGC CAT 478
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175

GGA CAT CGA ATG GCT 541
Gly His Arg Met Ala
180



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Gly Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Ala Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR33-1-20

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

C GTC GGC GCT CCC GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAG GAC GGG ATA AAC TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCT TTT TCT ATC TTC CTT CTT GCT CTG TTC TCT TGC TTA ATC 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT GGT CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Gly Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGT AGT ATT GTG TAT GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC GCG CCC GGC TGT GTA CCT TGT GTC CAG GAC GGC AAT 286
Val Ile Leu His Ala Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACG TCT ACA TGC TGG ACC CCA GTA ACA CCT ACA GTG GCA GTC AGG TAC 334
Thr Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr
100 105 110

GTC GGG GCA ACC ACC GCT TCG ATA CGC AGT CAT GTG GAC CTG TTA GTA 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
115 120 125

GGC GCG GCC ACG ATG TGC TCT GCG CTT TAC GTG GGT GAT ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCC CGC CGC CAT 478
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175

GGA CAT CGA ATG GCT 541
Gly His Arg Met Ala
180



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Gly Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Ala Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 287 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: HCC1153

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3..287

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

TA GAC TTT TGG GAG AGC GTC TTC ACT GGA CTA ACT CAC ATA GAT GCC 47
Asp Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala
1 5 10 15

CAC TTT CTG TCA CAG ACT AAG CAG CAG GGA CTC AAC TTC TCG TTC CTG 95
His Phe Leu Ser Gln Thr Lys Gln Gln Gly Leu Asn Phe Ser Phe Leu
20 25 30

ACT GCC TAC CAA GCC ACT GTC TGC GCT CGC GCG CAG GCT CCT CCC CCA 143
Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro
35 40 45

ACT TGG GAC GAG ATG TGG AAG TGT CTC GTA CGG CTT AAG CCA ACA CTA 191
Ser Trp Asp Glu Met Trp Lys Cys Leu Val Arg Leu Lys Pro Thr Leu
50 55 60

CAT GGA CCT ACG CCT CTT CTA TAT CGG TTG GGG CCT GTC CAA AAT GAA 239
His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln Asn Glu
65 70 75

ATC TGC TT" ACA CAC CCC ATC ACA AAA TAC ATC ATG GCA TGC ATG TCA 287
Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser
80 85 90 95



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 95 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Asp Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His
1 5 10 15

Phe Leu Ser Gln Thr Lys Gln Gln Gly Leu Asn Phe Ser Phe Leu Thr
20 25 30

Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
35 40 45

Trp Asp Glu Met Trp Lys Cys Leu Val Arg Leu Lys Pro Thr Leu His
50 55 60

Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln Asn Glu Ile
65 70 75 80

Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser
85 90 95

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 401 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: HD10-1-25

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3..401

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

TC CAA AAT GAA ATC TGC TTG ACA CAC CCC GTC ACA AAA TAC ATT ATG 47
Gln Asn Glu Ile Cys Leu Thr His Pro Val Thr Lys Tyr Ile Met
1 5 10 15

GCA TGC ATG TCA GCT GAT CTG GAA GTA ACC ACC AGC ACC TGG GTG TTG 95
Ala Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu
20 25 30

CTT GGA GGG GTC CTC GCG GCC CTA GCG GCC TAC TGC TTG TCA GTC GGC 143
Leu Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly
35 40 45

TGC GTT GTA ATC GTG GGT CAT ATC GAG CTG GGG GGC AAG CCG GCA CTC 191
Cys Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Leu
50 55 60

GTT CCA GAC AAG GAG GTG TTG TAT CAA CAG TAC GAT GAG ATG GAG GAG 239
Val Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu
65 70 75

TGC TCG CAA GCC GCC CCA TAC ATC GAA CAA GCT CAG GTA ATA GCC CAC 287
Cys Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His
80 85 90 95

CAG TTC AAG GAG AAA ATC CTT GGA CTG CTG CAG CGA GCC ACC CAA CAA 335
Gln Phe Lys Glu Lys Ile Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln
100 105 110

CAA GCT GTC ATT GAG CCC GTA ATA GCT TCC AAC TGG CAA AAG CTT GAA 383
Gln Ala Val Ile Glu Pro Val Ile Ala Ser Asn Trp Gln Lys Leu Glu
115 120 125

ACC TTC TGG CAC AAG CAT 401
Thr Phe Trp His Lys His
130



(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 133 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Gln Asn Glu Ile Cys Leu Thr His Pro Val Thr Lys Tyr Ile Met Ala
1 5 10 15

Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu
20 25 30

Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys
35 40 45

Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Leu Val
50 55 60

Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys
65 70 75 80

Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln
85 90 95

Phe Lys Glu Lys Ile Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln
100 105 110

Ala Val Ile Glu Pro Val Ile Ala Ser Asn Trp Gln Lys Leu Glu Thr
115 120 125

Phe Trp His Lys His
130

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 401 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: HD10-1*3

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3..401

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

TC CAA AAT GAA ATC TGC TTG ACA CAC CCC GTC ACA AAA TAC ATT ATG 47
Gln Asn Glu Ile Cys Leu Thr His Pro Val Thr Lys Tyr Ile Met
1 5 10 15

GCA TGC ATG TCA GCT GAT CTG GAA GTA ACC ACC AGC ACC TGG GTG TTG 95
Ala Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu
20 25 30

CTT GGA GGG GTC CTC GCG GCC CTA GCG GCC TAC TGC TTG TCA GTC GGC 143
Leu Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly
35 40 45

TGC GTT GTA ATC GTG GGT CAT ATC GAG CTG GGG GGC AAG CCG GCA CTC 191
Cys Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Leu
50 55 60

GTT CCA GAC AAG GAG GTG TTG TAT CAA CAG TAC GAT GAG ATG GAG GAG 239
Val Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu
65 70 75

TGC TCG CAA GCC GCC CCA TAC ATC GAA CAA GCT CAG GTA ATA GCC CAC 287
Cys Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His
80 85 90 95

CAG TTC AAG GAG AAA ATC CTT GGA CTG CTG CAG CGA GCC ACT CAA CAA 335
Gln Phe Lys Glu Lys Ile Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln
100 105 110

CAA GCT GTC ATT GAG CCC GTA ATA GCT TCC AAC TGG CAA AAG CTT GAA 383
Gln Ala Val Ile Glu Pro Val Ile Ala Ser Asn Trp Gln Lys Leu Glu
115 120 125

ACC TTC TGG CAC AAG CAT 401
Thr Phe Trp His Lys His
130



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 133 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Gln Asn Glu Ile Cys Leu Thr His Pro Val Thr Lys Tyr Ile Met Ala
1 5 10 15

Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu
20 25 30

Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys
35 40 45

Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Leu Val
50 55 60

Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys
65 70 75 80

Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln
85 90 95

Phe Lys Glu Lys Ile Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln
100 105 110

Ala Val Ile Glu Pro Val Ile Ala Ser Asn Trp Gln Lys Leu Glu Thr
115 120 125

Phe Trp His Lys His
130

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 401 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-20-164

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3..401

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

TC CAA AAT GAA ATC TGC TTG ACA CAC CCC ATC ACA AAA TAC ATC ATG 47
Gln Asn Glu Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met
1 5 10 15

GCA TGC ATG TCA GCT GAT CTG GAA GTA ACC ACC AGC ACC TGG GTT TTG 95
Ala Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu
20 25 30

CTT GGA GGG GTC CTC GCG GCC CTA GCG GCC TAC TGC TTG TCA GTC GGT 143
Leu Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly
35 40 45

TGT GTT GTG ATT GTG GGT CAT ATC GAG CTG GGG GGC AAG CCG GCA ATC 191
Cys Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile
50 55 60

GTT CCA GAC AAA GAG GTG TTG TAT CAA CAA TAC GAT GAG ATG GAA GAG 239
Val Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu
65 70 75

TGC TCA CAA GCT GCC CCA TAT ATC GAA CAA GCT CAG GTA ATA GCT CAC 287
Cys Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His
80 85 90 95

CAG TTC AAG GGA AAA GTC CTT GGA TTG CTG CAG CGA GCC ACC CAA CAA 335
Gln Phe Lys Gly Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln
100 105 110

CAA GCT GTC ATT GAG CCC ATA GTA ACT ACC AAC TGG CAA AAG CTT GAG 383
Gln Ala Val Ile Glu Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu
115 120 125

GCC TTT TGG CAC AAG CAT 401
Ala Phe Trp His Lys His
130



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 133 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Gln Asn Glu Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
1 5 10 15

Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu
20 25 30

Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys
35 40 45

Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile Val
50 55 60

Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys
65 70 75 80

Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln
85 90 95

Phe Lys Gly Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln
100 105 110

Ala Val Ile Glu Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu Ala
115 120 125

Phe Trp His Lys His
130

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 401 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-20-166

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3..401

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

TC CAA AAT GAA ATC TGC TTG ACA CAC CCC ATC ACA AAA TAC ATC ATG 47
Gln Asn Glu Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met
1 5 10 15

GCA TGC ATG TCA GCT GAT CTG GAA GTA ACC ACC AGC ACC TGG GTT TTG 95
Ala Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu
20 25 30

CTT GGA GGG GTC CTC GCG GC- CTA GCG GCC TAC TGC TTG TCA GTC GGT 143
Leu Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly
35 40 45

TGT GTT GTG ATT GTG GGT CAT ATC GAG CTG GGG GGC AAG CCG GCA ATC 191
Cys Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile
50 55 60

GTT CCA GAC AAA GAG GTG TTG TAT CAA CAA TAC GAT GAG ATG GAA GAG 239
Val Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu
65 70 75

TGC TCA CAA GCT GCC CCA TAT ATC GAA CAA GCT CAG GTG ATA GCT CAC 287
Cys Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His
80 85 90 95

CAG TTC AAG GAA AAA GTC CTT GGA TTG CTG CAG CGA GCC ACC CAA CAA 335
Gln Phe Lys Glu Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln
100 105 110

CAA GCT GTC ATT GAG CCC ATA GTA ACT ACC AAC TGG CAA AAG CTT GAG 383
Gln Ala Val Ile Glu Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu
115 120 125

GCC TTT TGG CAC AAG CAT 401
Ala Phe Trp His Lys His
130



(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 133 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Gln Asn Glu Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
1 5 10 15

Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu
20 25 30

Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys
35 40 45

Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile Val
50 55 60

Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys
65 70 75 80

Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln
85 90 95

Phe Lys Glu Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln
100 105 110

Ala Val Ile Glu Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu Ala
115 120 125

Phe Trp His Lys His
130

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 401 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-20-165

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3..401

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

TC CAA AAT GAA ATC TGC TTG ACA CAC CCC ATC ACA AAA TAC ATC ATG 47
Gln Asn Glu Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met
1 5 10 15

GCA TGC ATG TCA GCT GAT CTG GAA GTA ACC ACC AGC ACC TGG GTT TTG 95
Ala Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu
20 25 30

CTT GGA GGG GTC CTC GCG GCC CTA GCG GCT TAC TGC TTG TCA GTC GGT 143
Leu Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly
35 40 45

TGT GTT GTG ATT GTG GGT CAT ATC GAG CTG GGG GGC AAG CCG GCA ATC 191
Cys Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile
50 55 60

GTT CCA GAC AAA GAG GTG TTG TAT CAA CAA TAC GAT GAG ATG GAA GAG 239
Val Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu
65 70 75

TGC TCA CAA GCT GCC CCA TAT ATC GAA CAA GCT CAG GTA ATA GCT CAC 287
Cys Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His
80 85 90 95

CAG TTC AAG GAA AAA GTC CTT GGA TTG CTG CAG CGA GCC ACC CAA CAA 335
Gln Phe Lys Glu Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln
100 105 110

CAA GCT GTC ATT GAG CCC ATA GTA ACT ACC AAC TGG CAA AAG CTT GAG 383
Gln Ala Val Ile Glu Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu
115 120 125

GCC TTT TGG CAC AAG CAT 401
Ala Phe Trp His Lys His
130



(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 133 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Gln Asn Glu Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
1 5 10 15

Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu
20 25 30

Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys
35 40 45

Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile Val
50 55 60

Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys
65 70 75 80

Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln
85 90 95

Phe Lys Glu Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln
100 105 110

Ala Val Ile Glu Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu Ala
115 120 125

Phe Trp His Lys His
130



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 509 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: PC-2-1

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3..509

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

CC ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA AAC ACC 47
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr
1 5 10 15

AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG ATC GTT 95
Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val
20 25 30

GGC GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGG ATG GGT GTG CGC 143
Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg
35 40 45

GCG ACT CGG AAG ACT TCG GAA CGG TCG CAA CCC CGT GGA CGG CGT CAG 191
Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln
50 55 60

CCT ATT CCC AAG GCG CGC CAG CCC ACG GGC CGG TCC TGG GGT CAA CCC 239
Pro Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro
65 70 75

GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC GGG TGG GCA GGG 287
Gly Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly
80 85 90 95

TGG CTG CTC TCC CCT CGA GGC TCT CGG CCT AAT TGG GGC CCC AAT GAC 335
Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp
100 105 110

CCC CGG CGA AAA TCG CGT AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG 383
Pro Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr
115 120 125

TGC GGA TTC GCC GAT CTC ATG GGG TAT ATC CCG CTC GTA GGC GGC CCC 431
Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro
130 135 140

ATT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC CTT GAG 479
Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu
145 150 155

GAC GGG GTA AAC TAT GCA ACA GGG AAT TTA 509
Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu
160 165



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 169 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Ala Thr Gly Asn Leu
165

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 509 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: PC-2-6

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3..509

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

CC ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA AAC ACC 47
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr
1 5 10 15

AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG ATC GTT 95
Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val
20 25 30

GGC GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGG ATG GGT GTG CGC 143
Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg
35 40 45

GCG ACT CGG AAG ACT TCG GAA CGG TCG CAA CCC CGT GGA CGG CGT CAG 191
Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln
50 55 60

CCT ATT CCC AAG GCG CGC CAG CCC ACG GGC CGG TCC TGG GGT CAA CCC 239
Pro Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro
65 70 75

GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC GGG TGG GCA GGG 287
Gly Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly
80 85 90 95

TGG CTG CTC TCC CCT CGA GGC TCT CGG CCT AAT TGG GGC CCC AAT GAC 335
Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp
100 105 110

CCC CGG CGA AAA TCG CGT AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG 383
Pro Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr
115 120 125

TGC GGA TTC GCC GAT CTC ATG GGG TAT ATC CCG CTC GTA GGC GGC CCC 431
Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro
130 135 140

ATT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC CTT GAG 479
Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu
145 150 155

GAC GGG GTA AAC TAT GCA ACA GGG AAT TTA 509
Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu
160 165

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 169 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Ala Thr Gly Asn Leu
165



(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 580 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: PC-4-1

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..580

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

A ACG TGC GGA TTC GCC GAT CTC ATG GGG TAT ATC CCG CTC GTA GGC 46
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly
1 5 10 15

GGC CCC ATT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC 94
Gly Pro Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val
20 25 30

CTT GAG GAC GGG GTA AAC TAT GCA ACA GGG AAT TTA CCC GGT TGC TCT 142
Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser
35 40 45

TTC TCT ATC TTT ATT CTT GCT CTT CTC TCG TGT CTG ACC GTT CCG GCC 190
Phe Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala
50 55 60

TCT GCA GTT CCC TAC CGA AAT GCC TCT GGG ATT TAT CAT GTT ACC AAT 238
Ser Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn
65 70 75

GAT TGC CCA AAC TCT TCC ATA GTC TAT GAG GCA GAT AAC CTG ATC CTA 286
Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
80 85 90 95

CAC GCA CCT GGT TGC GTG CCT TGT GTC ATG ACA GGT AAT GTG AGT AGA 334
His Ala Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg
100 105 110

TGC TGG GTC CAA ATT ACC CCT ACA CTG TCA GCC CTG AGC CTC GGA GCA 382
Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala
115 120 125

GTC ACG GCT CCT CTT CGG AGA GCC GTT GAC TAC CTA GCG GGA GGG GCT 430
Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala
130 135 140

GCC CTC TGC TCC GCG TTA TAC GTA GGA GAC GCG TGT GGG GCA CTA TTC 478
Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe
145 150 155

TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAC GCT ACG GTG 526
Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val
160 165 170 175

CAG AAC TGC AAC TGT TCC ATT TAC AGT GGC CAT GTT ACC GGC CAC CGG 574
Gln Asn Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg
180 185 190

ATG GCA 580
Met Ala



(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 193 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly
1 5 10 15

Pro Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu
20 25 30

Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His
85 90 95

Ala Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys
100 105 110

Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val
115 120 125

Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala
130 135 140

Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln
165 170 175

Asn Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met
180 185 190

Ala



(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 580 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: PC-4-5

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..580

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

A ACG TGC GGA TTC GCC GAT CTC ATG GGG TAT ATC CCG CTC GTA GGC 46
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly
1 5 10 15

GGC CCC ATT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC 94
Gly Pro Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val
20 25 30

CTT GAG GAC GGG GTA AAC TAT GCA ACA GGG AAT TTA CCC GGT TGC TCT 142
Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser
35 40 45

TTC TCT ATC TTT ATT CTT GCT CTT CTC TCG T::T C"3 3— 190
Phe Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala
50 55 60

TCT GCA GTT CCC TAC CGA AAT GCC TCT GGG ATT TAT CAT GTT ACC AAT 238
Ser Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn
65 70 75

GAT TGC CCA AAC TCT TCC ATA GTC TAT GAG GCA GAT AAC CTG ATC CTA 286
Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
80 85 90 95

CAC GCA CCT GGT TGC GTG CCT TGT GTC ATG ACA GGT AAT GTG AGT AGA 334
His Ala Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg
100 105 110

TGC TGG GTC CAA ATT ACC CCT ACA CTG TCA GCC CCG AGC CTC GGA GCA 382
Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala
115 120 125

GTC ACG GCT CCT CTT CGG AGA GCC GTT GAC TAC CTA GCG GGA GGG GCT 430
Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala
130 135 140

GCC CTC TGC TCC GCG TTA TAC GTA GGA GAC GCG TGT GGG GCA CTA TTC 478
Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe
145 150 155

TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAC GCT ACG GTG 526
Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val
160 165 170 175

CAG AAC TGC AAC TGT TCC ATT TAC AGT GGC CAT GTT ACC GGC CAC CGG 574
Gln Asn Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg
180 185 190

ATG GCA 580
Met Ala



(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 193 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly
1 5 10 15

Pro Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu
20 25 30

Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His
85 90 95

Ala Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys
100 105 110

Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val
115 120 125

Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala
130 135 140

Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln
165 170 175

Asn Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met
180 185 190

Ala



(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 959 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: PC-3-4

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 3..959

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

CC ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA AAC ACC 47
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr
1 5 10 15

AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG ATC GTT 95
Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val
20 25 30

GGC GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGG ATG GGT GTG CGC 143
Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg
35 40 45

GCG ACT CGG AAG ACT TCG GAA CGG TCG CAA CCC CGT GGA CGG CGT CAG 191
Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln
50 55 60

CCT ATT CCC AAG GCG CGC CAG CCC ACG GGC CGG TCG TGG GGT CAA CCC 239
Pro Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro
65 70 75

GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC GGG TGG GCA GGG 287
Gly Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly
80 85 90 95

TGG CTG CTC TCC CCT CGA GGC TCT CGG CCT AAT TGG GGC CCC AAT GAC 335
Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp
100 105 110

CCC CGG CGA AAA TCG CGT AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG 383
Pro Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr
115 120 125

TGC GGA TTC GCC GAT CTC ATG GGG TAT ATC CCG CTC GTA GGC GGC CCC 431
Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro
130 135 140

ATT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC CTT GAG 479
Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu
145 150 155

GAC GGG GTA AAC TAT GCA ACA GGG AAT TTA CCC GGT TGC TCT TTC TCT 527
Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser
160 165 170 175

ATC TTT ATT CTT GCT CTT CTC TCG TGT CTG ACC GTT CCG GCC TCT GCA 575
Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala
180 185 190

GTT CCC TAC CGA AAT GCC TCT GGG ATT TAT CAT GTT ACC AAT GAT TGC 623
Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp Cys
195 200 205

CCA AAC TCT TCC ATA GTC TAT GAG GCA GAT AAC CTG ATC CTA CAC GCA 671
Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His Ala
210 215 220

CCT GGT TGC GTG CCT TGT GTC ATG ACA GGT AAT GTG AGT AGA TGC TGG 719
Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys Trp
225 230 235

GTC CAA ATT ACC CCT ACA CTG TCA GCC CCG AGC CTC GGA GCA GTC ACG 767
Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val Thr
240 245 250 255

GCT CCT CTT CGG AGA GCC GTT GAC TAC CTA GCG GGA GGG GCT GCC CTC 815
Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala Leu
260 265 270

TGC TCC GCG TTA TAC GTA GGA GAC GCG TGT GGG GCA CTA TTC TTG GTA 863
Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu Val
275 280 285

GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAC GCT ACG GTG CAG AAC 911
Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln Asn
290 295 300

TGC AAC TGT TCC ATT TAC AGT GGC CAT GTT ACC GGC CAC CGG ATG GCA 959
Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met Ala
305 310 315
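
The pairing of each codon row with its amino acid row in a CDS listing such as SEQ ID NO: 49 can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code; the function name `translate` and the `cds_start` parameter are illustrative and not part of the listing. The example row is the second codon row of SEQ ID NO: 49 (nucleotides 48 to 95, residues 16 to 31).

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nucleotides: str, cds_start: int = 1) -> str:
    """Translate a nucleotide string into one-letter amino acid codes.

    cds_start is the 1-based position of the first base of the first codon
    (hypothetical parameter mirroring the LOCATION field of a CDS feature).
    Codons containing ambiguity codes such as N are reported as X (Xaa).
    """
    seq = "".join(nucleotides.split()).upper()
    coding = seq[cds_start - 1:]
    usable = len(coding) - len(coding) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(coding[i:i + 3], "X") for i in range(0, usable, 3)
    )


# Second codon row of SEQ ID NO: 49 as printed in the listing.
ROW = "AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG ATC GTT"
print(translate(ROW))
```

The printed string can be compared, residue by residue, with the three-letter row beneath the codons (Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val).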



(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 319 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Val
180 185 190

Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp Cys Pro
195 200 205

Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His Ala Pro
210 215 220

Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys Trp Val
225 230 235 240

Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val Thr Ala
245 250 255

Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu Val Gly
275 280 285

Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln Asn Cys
290 295 300

Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met Ala
305 310 315

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: PC-3-9

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 3..959

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

CC ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA AAC ACC 47
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr
1 5 10 15

AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG ATC GTT 95
Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val
20 25 30

GGC GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGG ATG GGT GTG CGC 143
Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg
35 40 45

GCG ACT CGG AAG ACT TCG GAA CGG TCG CAA CCC CGC GGA CGG CGT CAG 191
Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln
50 55 60

CCT ATT CCC AAG GCG CGC CAG CCC ACG GGC CGG TCC TGG GGT CAA CCC 239
Pro Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro
65 70 75

GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC GGG TGG GCA GGG 287
Gly Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly
80 85 90 95

TGG CTG CTC TCC CCT CGA GGC TCT CGG CCT AAT TGG GGC CCC AAT GAC 335
Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp
100 105 110

CCC CGG CGA AAA TCG CGT AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG 383
Pro Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr
115 120 125

TGC GGA TTC GCC GAT CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 431
Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro
130 135 140

GTT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC CTT GAG 479
Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu
145 150 155

GAC GGG GTA AAC TAT CCA ACA GGG AAT TTA CCC GGT TGC TCT TTC TCT 527
Asp Gly Val Asn Tyr Pro Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser
160 165 170 175

ATC TTT ATT CTT GCT CTT CTC TCG TGT CTG ACC GTT CCG GCC TCT GCA 575
Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala
180 185 190

GTT CCC TAC CGA AAT GCC TCT GGG ATT TAT CAT GTT ACC AAT GAT TGC 623
Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp Cys
195 200 205

CCA AAC TCT TCC ATA GTC TAT GAG GCA GAT AAC CTG ATC CTA CAC GCA 671
Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His Ala
210 215 220

CCT GGT TGC GTG CCT TGT GTC ATG ACA GGT AAT GTG AGT AGA TGC TGG 719
Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys Trp
225 230 235

GTC CAA ATT ACC CCT ACA CTG TCA GCC CCG AGC CTC GGA GCA GTC ACG 767
Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val Thr
240 245 250 255

GCT CCT CTT CGG AGA GCC GTT GAC TAC CTA GCG GGA GGG GCT GCC CTC 815
Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala Leu
260 265 270

TGC TCC GCG TTA TAC GTA GGA GAC GCG TGT GGG GCA CTA TTC TTG GTA 863
Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu Val
275 280 285

GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAC GCT ACG GTG CAG AAC 911
Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln Asn
290 295 300

TGC AAC TGT TCC ATT TAC AGT GGC CAT GTT ACC GGC CAC CGG ATG GCA 959
Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met Ala
305 310 315



(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 319 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Val
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Pro Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Val
180 185 190

Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp Cys Pro
195 200 205

Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His Ala Pro
210 215 220

Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys Trp Val
225 230 235 240

Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val Thr Ala
245 250 255

Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu Val Gly
275 280 285

Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln Asn Cys
290 295 300

Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met Ala
305 310 315



(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: PC C/E1

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 3..959

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

CCATGAGCAC GAATCCTAAA CCTCAAAGAA AAACCAAAAG AAACACCAAC CGTCGCCCAC 60

AGGACGTCAA GTTCCCGGGC GGTGGTCAGA TCGTTGGCGG AGTTTACTTG TTGCCGCGCA 120

GGGGCCCTAG GATGGGTGTG CGCGCGACTC GGAAGACTTC GGAACGGTCG CAACCCCGTG 180

GACGGCGTCA GCCTATTCCC AAGGCGCGCC AGCCCACGGG CCGGTCCTGG GGTCAACCGG 240

GGTACCCTTG GCCCCTTTAC GCCAATGAGG GCCTCGGGTG GGCAGGGTGG CTGCTCTCCC 300

CTCGAGGCTC TCGGCCTAAT TGGGGCCCCA ATGACCCCCG GCGAAAATCG CGTAATTTGG 360

GTAAGGTCAT CGATACCCTA ACGTGCGGAT TCGCCGATCT CATGGGGTAY ATCCCGCTCG 420

TAGGCGGCCC CRTTGGGGGC GTCGCAAGGG CTCTCGCACA CGGTGTGAGG GTCCTTGAGG 480

ACGGGGTAAA CTATSCAACA GGGAATTTAC CCGGTTGCTC TTTCTCTATC TTTATTCTTG 540

CTCTTCTCTC GTGTCTGACC GTTCCGGCCT CTGCAGTTCC CTACCGAAAT GCCTCTGGGA 600

TTTATCATGT TACCAATGAT TGCCCAAACT CTTCCATAGT CTATGAGGCA GATAACCTGA 660

TCCTACACGC ACCTGGTTGC GTGCCTTGTG TCATGACAGG TAATGTGAGT AGATGCTGGG 720

TCCAAATTAC CCCTACACTG TCAGCCCCGA GCCTCGGAGC AGTCACGGCT CCTCTTCGGA 780

GAGCCGTTGA CTACCTAGCG GGAGGGGCTG CCCTCTGCTC CGCGTTATAC GTAGGAGACG 840

CGTGTGGGGC ACTATTCTTG GTAGGCCAAA TGTTCACCTA TAGGCCTCGC CAGCACGCTA 900

CGGTGCAGAA CTGCAACTGT TCCATTTACA GTGGCCATGT TACCGGCCAC CGGATGGCA 959

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 319 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Val
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Pro Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Val
180 185 190

Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp Cys Pro
195 200 205

Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His Ala Pro
210 215 220

Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys Trp Val
225 230 235 240

Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val Thr Ala
245 250 255

Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu Val Gly
275 280 285

Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln Asn Cys
290 295 300

Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met Ala
305 310 315

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 354 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: PC-1-37

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..354

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

ACCACCGGAG CTTCTATCAC ATACTCCACT TACGGCAAGT TCCTTGCTGA TGGAGGGTGT 60

TCAGGCGGCG CGCATGACGT GATCATATGC GACGAGTGCC ATTCCCAGGA CGCCACCACC 120

ATTCTTGGGA TAGGCACTGT CCTTGACCAG GCAGAGACGG CTGGAGCTAG GCTCGTCGTC 180

TTGGCCACGG NCACCCCTCC CGGCAGTGTG ACAACGCCCC ACCCCAACAT CGAGGAAGTG 240

GCCCTGCCTC AGGAGGGGGA GGTTCCCTTC TACGGCAGAG CCATTCCCCT TGCTTTTATA 300

AAGGGTGGTA GGCATCTCAT CTTCTGCCAT TCCAAGAAAA ATTGTGATGA ACTC 354

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 118 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Thr Thr Gly Ala Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala
1 5 10 15

Asp Gly Gly Cys Ser Gly Gly Ala His Asp Val Ile Ile Cys Asp Glu
20 25 30

Cys His Ser Gln Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr Val Leu
35 40 45

Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Xaa
50 55 60

Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro Asn Ile Glu Glu Val
65 70 75 80

Ala Leu Pro Gln Glu Gly Glu Val Pro Phe Tyr Gly Arg Ala Ile Pro
85 90 95

Leu Ala Phe Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys
100 105 110

Lys Asn Cys Asp Glu Leu
115



(2) INFORMATION FOR SEQ ID NO: 57:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 354 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: PC-1-43

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..354

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

ACCACCGGAG CTTCTATCAC ATACTCCACT TACGGCAAGT TCCTTGCTGA TGGAGGGTGT 60

TCAGGCGGCG CGTATGACGT GATCATATGC GACGAGTGCC ATTCCCAGGA CGCCACCACC 120

ATTCTTGGGA TAGGCACTGT CCTTGACCAG GCAGAGACGG CTGGAGCTAG GCTCGTCGTC 180

TTGGNCACGG NCACCCCTCC CGGCAGTGTG ACAACGCCCC ACCCCAACAT CGAGGAAGTG 240

GCCCTGCCTC AGGAGGGGGA GGTTCCCTTC TACGGNAGAG CCATTCCCCT TGCTTTTATA 300

AAGGGTGGTA GGCATCTCAT CTTCTGCCAT TCCAAGAAAA AATGTGATGA ACTT 354
(2) INFORMATION FOR SEQ ID NO: 58:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 133 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Thr Thr Gly Ala Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala
1 5 10 15

Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Val Ile Ile Cys Asp Glu
20 25 30

Cys His Ser Gln Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr Val Leu
35 40 45

Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Xaa Thr Xaa
50 55 60

Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro Asn Ile Glu Glu Val
65 70 75 80

Ala Leu Pro Gln Glu Gly Glu Val Pro Phe Tyr Xaa Arg Ala Ile Pro
85 90 95

Leu Ala Phe Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys
100 105 110

Lys Lys Cys Asp Glu Leu Arg Gln Ala Thr Asp Gln Pro Gly Arg Glu
115 120 125

Arg Pro Trp Glu Tyr
130



(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 357 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: PC-1-37

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..357

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

ATGGCTTTCA TGTCTCCGGA CTTGGAGGTC ATTACG^NCA CTTGGGTTCT GGTGGGGGGC 60

GTTGTGGCGA CCCTGNCGNC CTACTGCTTG ACGGTGGGTT CGGTAGCCAT AGTCGGTAGG 120

ATCATCCTCT CTGGGAAACC TGCCATCATT NCCGATAGGG AGGTATTATA CCAGCAATTT 180

GATGAGATGG AGGAGTGCTC GGCCTCGTTG CCCTATATGG ACGAAACACG TNCCATTGCT 240

GGACAATTCA AAGAGAAAGT GCTCGGCTTC ATCAGCACGA CCGGCCAGAA GGCTGAAACT 300

CTGAAGCCGG CAGCCACGTC TGTGTGGAAC AAGGCTGATC AGTTCTGGNC CACATAC 357
(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 128 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

Met Ala Phe Met Ser Pro Asp Leu Glu Val Ile Thr Xaa Thr Trp Val
1 5 10 15

Leu Val Gly Gly Val Val Ala Thr Leu Xaa Xaa Tyr Cys Leu Thr Val
20 25 30

Gly Ser Val Ala Ile Val Gly Arg Ile Ile Leu Ser Gly Lys Pro Ala
35 40 45

Ile Ile Xaa Asp Arg Glu Val Leu Tyr Gln Gln Phe Asp Glu Met Glu
50 55 60

Glu Cys Ser Ala Ser Leu Pro Tyr Met Asp Glu Thr Arg Xaa Ile Ala
65 70 75 80

Gly Gln Phe Lys Glu Lys Val Leu Gly Phe Ile Ser Thr Thr Gly Gln
85 90 95

Lys Ala Glu Thr Leu Lys Pro Ala Ala Thr Ser Val Trp Asn Lys Ala
100 105 110

Asp Gln Phe Trp Xaa Thr Tyr Met Trp Asn Phe Ile Ser Gly Ile Gln
115 120 125



(2) INFORMATION FOR SEQ ID NO: 61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 357 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: PC-1-48

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..357

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

ATGGCTTGCA TGTCTGCGGA CCTGGAGGTC ATTACCANCA CTTGGGTTCT GGTGGGGGGC

GTTGTGGCGN CCCTGGCGGC CTACTGCTTG ACGGTGGGTT CGGTAGCCAT AGTCGGTAGG

ATCATCCTCT CTGGGAAACC TGCCATCATT CCCGATAGGG AGGCATTATA CCANCAATTT

GATGAGATGG AGGAGTGCTC GGCCTCGTTG CCCTATATGG ACGAGACACG TGCCATTGCC

GGACAATTCA AAGAGAAAGT GCTCGGCTTC ATCAGCACGA CCGGCCAGAA GGCTGAAACT

CTGAAGCCGG CAGCCACGTC TGTGTGGAAC AAGGCTGANC AGTTCTGGGC CACATAC

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 128 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Met Ala Cys Met Ser Ala Asp Leu Glu Val Ile Thr Xaa Thr Trp Val
1 5 10 15

Leu Val Gly Gly Val Val Ala Xaa Leu Ala Ala Tyr Cys Leu Thr Val
20 25 30

Gly Ser Val Ala Ile Val Gly Arg Ile Ile Leu Ser Gly Lys Pro Ala
35 40 45

Ile Ile Pro Asp Arg Glu Ala Leu Tyr Xaa Gln Phe Asp Glu Met Glu
50 55 60

Glu Cys Ser Ala Ser Leu Pro Tyr Met Asp Glu Thr Arg Ala Ile Ala
65 70 75 80

Gly Gln Phe Lys Glu Lys Val Leu Gly Phe Ile Ser Thr Thr Gly Gln
85 90 95

Lys Ala Glu Thr Leu Lys Pro Ala Ala Thr Ser Val Trp Asn Lys Ala
100 105 110

Xaa Gln Phe Trp Ala Thr Tyr Met Trp Asn Phe Ile Ser Gly Ile Gln
115 120 125

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr161"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

ACCGGAGGCC AGGAGAGTGA TCTCCTCC 28

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr162"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

GGGCTGCTCT ATCCTCATCG ACGCCATC 28
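
Primers flagged ANTI-SENSE: YES, such as SEQ ID NO: 64, are written 5' to 3' on the strand complementary to the coding sequence. To locate such a primer on a sense-strand listing, its reverse complement can be computed. The following is a minimal sketch; the function name `reverse_complement` is illustrative, and the complement table covers the IUPAC ambiguity codes that appear in later primers of this listing.

```python
# Complement of each IUPAC nucleotide code (ambiguity codes map to the
# code for the complementary base set, e.g. R = A/G pairs with Y = T/C).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a nucleotide string.

    Whitespace (as used to group bases in tens) is ignored.
    """
    seq = "".join(sequence.split()).upper()
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# SEQ ID NO: 64 as printed (anti-sense primer), without the trailing count.
PRIMER_64 = "GGGCTGCTCT ATCCTCATCG ACGCCATC"
print(reverse_complement(PRIMER_64))
```

Applying the function twice returns the original primer, which is a quick sanity check on the complement table.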

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 28 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr163"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:
GCCAGAGGCT CGGAAGGCGA TCAGCGCT 28



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr164"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
GAGCTGCTCT GTCCTCCTCG ACGCCGCA 28
(2) INFORMATION FOR SEQ ID NO: 67:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..23

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr23"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

CTCATGGGGT ACATTCCGCT

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr54"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
CTATTACCAG TTCATCATCA TATCCCA
(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr116"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
TTTTAAATAC ATCATGRCTG YATG
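
Several primers in this listing, SEQ ID NO: 69 among them, contain IUPAC ambiguity codes (here R = A or G and Y = C or T), so each entry denotes a small pool of concrete oligonucleotides. The following is a minimal sketch that enumerates that pool; the function name `expand_primer` is illustrative and not taken from the listing.

```python
from itertools import product

# Bases represented by each IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer: str) -> list[str]:
    """List every unambiguous sequence covered by a degenerate primer.

    Whitespace (as used to group bases in tens) is ignored.
    """
    seq = "".join(primer.split()).upper()
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in seq))]


# SEQ ID NO: 69 as printed: one R and one Y give 2 x 2 = 4 variants.
for variant in expand_primer("TTTTAAATAC ATCATGRCTG YATG"):
    print(variant)
```

The number of variants is the product of the degeneracies at each position, which is why primers with many ambiguity codes (or with N) expand quickly.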
(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr65"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
CTATTATTGT ATCCCRCTGA TGAARTTCCA CAT



(2) INFORMATION FOR SEQ ID NO : 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: YES

(iii) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr118"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

ACTAGTCGAC TAYTGATCCR CTATRWARTT CCACAT 36

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..23

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr117"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
TTTTAAATAC ATCGCRCTGC ATGCA
(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: YES 

(iii) ANTI-SENSE: YES 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 
(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr119"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 
ACTAGTCGAC TARTTGCATA GCCKRTTCAT CCAYTG

(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr131"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
GGAATTCTAG ACCTCTGGGA YGARAYTGGA ARTG
(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: YES 
(iii) ANTI-SENSE: NO 



(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr130"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 
GGAATTCTAG ACGCTAYCAR GCACGTTGYG C 31 



(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr134"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
CATATAGATG CCCACTTCCT ATC
(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr3"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 
GTGTGCCAGG ACCATC 

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr4"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

GACATGCATG TCATGATGTA 

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr152"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

TACGCCTCTT CTATATCGGT TGGGGCCTG

(2) INFORMATION FOR SEQ ID NO : 80: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr52"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:
ATGTTGGGTA AGGTCATCGA TACCCT
(2) INrORJIATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: YES 
(iii) ANTI-SENSE: NO



(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr41"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
CCCGGGAGGT CTCGTAGACC GTGCA 25

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: YES 
(iii) ANTI-SENSE: YES 



(ix) FEATURE: 

(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr40"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
CTATTAAAGA TAGAGAAAGA GCAACCGGG 29 



(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(viii) POSITION IN PROTEIN: 

(B) MAP POSITION: positions 192 to 203 of the V1 region of HCV type 3



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 
Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu 
1 5 10
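
The peptide records give residues in three-letter code together with a map position range, so two simple consistency checks are possible: converting to one-letter code and confirming that the residue count matches the stated span. The sketch below is illustrative only and not part of the original listing; the names `THREE_TO_ONE`, `to_one_letter` and `span_length` are assumptions introduced here.

```python
# Standard three-letter to one-letter amino acid codes (20 common residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(peptide: str) -> str:
    """Convert a space-separated three-letter peptide to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in peptide.split())


def span_length(first: int, last: int) -> int:
    """Number of residues covered by an inclusive map position range."""
    return last - first + 1


# SEQ ID NO: 83 as listed, with map positions 192 to 203.
seq83 = "Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu"
one_letter = to_one_letter(seq83)
print(one_letter)
print(len(one_letter) == span_length(192, 203))
```

A residue that fails the dictionary lookup raises `KeyError`, which is a convenient way to spot transcription damage in a three-letter sequence.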

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(viii) POSITION IN PROTEIN: 

(B) MAP POSITION: positions 192 to 203 of the V1 region of HCV type 5



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(viii) POSITION IN PROTEIN:

(B) MAP POSITION: positions 213 to 223 of the V2 region of HCV type 3



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

Val Tyr Glu Ala Asp Asp Val Ile Leu His Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 



(viii) POSITION IN PROTEIN: 

(B) MAP POSITION: positions 213 to 223 of the V2 region of HCV type 5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

Val Tyr Glu Ala Asp Asn Leu Ile Leu His Ala
1 5 10

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 



(viii) POSITION IN PROTEIN: 

(B) MAP POSITION: positions 230 to 242 of the V3 region of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Val Gln Asp Gly Asn Thr Ser Thr Cys Trp Thr Pro Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 



(viii) POSITION IN PROTEIN:

(B) MAP POSITION: positions 230 to 242 of the V3 region of HCV type 5



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

Val Met Thr Gly Asn Val Ser Arg Cys Trp Val Gln Ile
1 5 10

(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(viii) POSITION IN PROTEIN: 

(B) MAP POSITION: positions 248 to 257 of the V4 region of HCV type 3



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

Val Arg Tyr Val Gly Ala Thr Thr Ala Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO



(viii) POSITION IN PROTEIN:

(B) MAP POSITION: positions 248 to 257 of the V4 region of HCV type 5



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Ala Pro Ser Leu Gly Ala Val Thr Ala Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(viii) POSITION IN PROTEIN: 

(B) MAP POSITION: positions 294 to 303 of the V5 region of HCV type 3



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Arg Pro Arg Arg His Gln Thr Val Gln Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(viii) POSITION IN PROTEIN: 

(B) MAP POSITION: positions 294 to 303 of the V5 region of HCV type 5



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

Arg Pro Arg Gln His Ala Thr Val Gln Asn
1 5 10

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN PROTEIN:

(B) MAP POSITION: positions 70 to 78 of HCV type 5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

Gln Pro Thr Gly Arg Ser Trp Gly Gln
1 5 

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: BR33 and BR36

(viii) POSITION IN PROTEIN:

(B) MAP POSITION: positions 230 to 237 of the V3 region of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Val Gln Asp Gly Asn Thr Ser Thr
1 5 

(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: HD10

(viii) POSITION IN PROTEIN:

(B) MAP POSITION: positions 230 to 237 of the V3 region of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

Val Gln Asp Gly Asn Thr Ser Ala
1 5 

(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: BR36 

(viii) POSITION IN PROTEIN: 

(B) MAP POSITION: positions 248 to 257 of the V4 region of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

Val Lys Tyr Val Gly Ala Thr Thr Ala Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: BR36
(viii) POSITION IN GENOME:

(B) MAP POSITION: positions 1688 to 1707 of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

Leu Gly Gly Lys Pro Ala Ile Val Pro Asp Lys Glu Val Leu Tyr Gln

Gln Tyr Asp Glu
20 

(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: HD10

(viii) POSITION IN GENOME: 

(B) MAP POSITION: positions 1688 to 1707 of HCV type 3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

Leu Gly Gly Lys Pro Ala Leu Val Pro Asp Lys Glu Val Leu Tyr Gln

Gln Tyr Asp Glu
20 



(2) INFORMATION FOR SEQ ID NO: 99:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(viii) POSITION IN GENOME: 

(B) MAP POSITION: positions 1712 to 1731



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln
1 5 10 15

Phe Lys Glu Lys 
20 

(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: BR36

(viii) POSITION IN GENOME:

(B) MAP POSITION: positions 1724 to 1743 of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Ile Ala His Gln Phe Lys Glu Lys Val Leu Gly Leu Leu Gln Arg Ala
1 5 10 15

Thr Gln Gln Gln
20 

(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: HD10

(viii) POSITION IN GENOME:

(B) MAP POSITION: positions 1724 to 1743 of HCV type 3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

Ile Ala His Gln Phe Lys Glu Lys Ile Leu Gly Leu Leu Gln Arg Ala
1 5 10 15
Thr Gln Gln Gln
20 
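
SEQ ID NO: 100 and SEQ ID NO: 101 cover the same map positions (1724 to 1743) in two isolates, so the residues at which they differ can be located by a position-wise comparison. The sketch below is illustrative and not part of the original listing; the function name `differing_positions` is an assumption, and the residue strings are typed in with the usual three-letter spellings (Ile, Gln, His).

```python
def differing_positions(a: str, b: str, first_position: int):
    """List (map_position, residue_a, residue_b) where two peptides differ.

    Both peptides are space-separated three-letter codes of equal length;
    first_position is the map position of the first residue.
    """
    residues_a, residues_b = a.split(), b.split()
    if len(residues_a) != len(residues_b):
        raise ValueError("peptides must have the same number of residues")
    return [
        (first_position + i, x, y)
        for i, (x, y) in enumerate(zip(residues_a, residues_b))
        if x != y
    ]


seq100 = ("Ile Ala His Gln Phe Lys Glu Lys Val Leu "
          "Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln")
seq101 = ("Ile Ala His Gln Phe Lys Glu Lys Ile Leu "
          "Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln")

print(differing_positions(seq100, seq101, 1724))
```

The same helper applies to any pair of peptide records that share a map position range, for example the type 3 and type 5 peptides listed for positions 1688 to 1707.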



(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 



(viii) POSITION IN GENOME: 

(B) MAP POSITION: positions 1688 to 1707 of HCV type 5



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Ala Leu Tyr Gln
1 5 10 15

Gln Phe Asp Glu

20 

(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 



(viii) POSITION IN GENOME: 

(B) MAP POSITION: positions 1688 to 1707



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln
1 5 10 15

Gln Phe Asp Glu

20 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO



(viii) POSITION IN GENOME: 

(B) MAP POSITION: positions 1712 to 1731 of HCV type 5



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

Ser Ala Ser Leu Pro Tyr Met Asp Glu Thr Arg Ala Ile Ala Gly Gln
1 5 10 15

Phe Lys Glu Lys 
20 

(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO



(viii) POSITION IN GENOME: 

(B) MAP POSITION: positions 1724 to 1743 of HCV type 5



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 



Ile Ala Gly Gln Phe Lys Glu Lys Val Leu Gly Phe Ile Ser Thr Thr
1 5 10 15

Gly Gln Lys Ala
20 

(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO



(vii) IMMEDIATE SOURCE:

(B) CLONE: GB48-3-10

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..340

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

C TCC ACT GTA ACC GAA AAG GAC ATC AGG GTC GAG GAG GAG GTC TAT 46 
Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr
1 5 10 15

CAG TGT TGT GAC CTG GAG CCC GAA GCC CGC AAG GCA ATT ACC GCC CTA 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Ala Ile Thr Ala Leu
20 25 30

ACA GAG AGA CTC TAC GTG GGC GGT CCC ATG CAT AAC AGC AAG GGA GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp
35 40 45

CTG TGC GGG TAT CGC AGA TGT CGC GCA AGC GGC GTC TAC ACC ACC AGC 190
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACA CTG ACG TGC TAC CTC AAA GCC TCA GCC GCT ATC AAA 238
Phe Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Lys
65 70 75

GCG GCG GGG CTG AGA GAC TGC ACC ATG TTG GTC TGT GGT GAT GAC CTG 286
Ala Ala Gly Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTT GTC ATC GCT GAG AGC GAT GGC GTA GAG GAG GAC AAA CGA CCC CTC 334
Val Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Pro Leu
100 105 110 

GGA GCC 340 
Gly Ala 
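
The CDS feature of this record starts at base 2, so the reading frame is offset by one base from the start of the printed sequence. The sketch below, which is illustrative and not part of the original listing, translates a printed row from a stated CDS start using the standard genetic code and counts the complete codons in a CDS range. The names `CODON_TABLE`, `translate_cds` and `codon_count` are assumptions introduced here.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_cds(seq: str, start: int, end: int) -> str:
    """Translate bases start..end (1-based, inclusive) into one-letter code.

    Spaces are ignored; a trailing incomplete codon is dropped.
    """
    bases = seq.replace(" ", "").upper()
    cds = bases[start - 1:end]
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def codon_count(start: int, end: int) -> int:
    """Number of complete codons in an inclusive CDS range."""
    return (end - start + 1) // 3


# First printed row of SEQ ID NO: 106 (bases 1 to 46); the CDS begins at base 2.
first_row = "C TCC ACT GTA ACC GAA AAG GAC ATC AGG GTC GAG GAG GAG GTC TAT"
print(translate_cds(first_row, 2, 46))
print(codon_count(2, 340))
```

The codon count for a CDS range can be compared with the LENGTH field of the matching protein record; a stop codon appearing inside the translated row would show up as `*` and point to a transcription problem in the nucleotide line.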



(2) INFORMATION FOR SEQ ID NO: 10 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 113 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Ala Ile Thr Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Lys Ala
65 70 75 80

Ala Gly Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Pro Leu Gly
100 105 110

Ala 

(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO



(vii) IMMEDIATE SOURCE: 

(B) CLONE: GB115-3-5

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..340



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 

C TCC ACT GTA ACC GAA AAG GAC ATC AGG GTC GAG GAG GAG GTA TAT 46
Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr
1 5 10 15

CAG TGT TGT GAC CTG GAG CCC GAG GCC CGC AGA GCA ATT ACC GCC CTA 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Arg Ala Ile Thr Ala Leu
20 25 30

ACA GAG AGA CTC TAC GTG GGC GGT CCC ATG CAT AAC AGC AGG GGA GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Arg Gly Asp
35 40 45

CTG TGC GGG TAT CGC AGA TGC CGT GCG AGC GGC GTC TAC ACC ACC AGC 190
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACA CTG ACG TGC TAT CTC AAA GCC TCA GCC GCT ATC AGA 238
Phe Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg
65 70 75

GCG GCG GGG CTG AGA GAC TGC ACC ATG TTG GTC TGT GGT GAT GAC CTG 286
Ala Ala Gly Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTC GTC ATT GCT GAA AGC GAT GGC GTA GAG GAG GAC AAA CGA GCC CTC 334
Val Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Ala Leu
100 105 110

GGA GCC 340
Gly Ala



(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Arg Ala Ile Thr Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Arg Gly Asp Leu
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg Ala
65 70 75 80

Ala Gly Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Ala Leu Gly
100 105 110 

Ala 



(2) INFORMATION FOR SEQ ID NO: 110: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: GB215-3-8 

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 2..340



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

C TCC ACT GTA ACC GAA AAA GAC ATC AGG GTC GAG GAG GAG GTA TAT 46
Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr
1 5 10 15

CAG TGT TGT GAC CTG GAG CCC GAA GCC CGC AAG GTA ATT ACC GCC CTA 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Thr Ala Leu
20 25 30

ACA GAG AGA CTC TAT GTG GGC GGT CCC ATG CAT AAT AGC AAA GGA GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp
35 40 45

CTG TGC GGG TAT CGC AGA TGC CGC GCA AGC GGC GTC TAC ACC ACC AGC 190
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACA CTG ACG TGC TAT CTC AAA GCC TCA GCC GCC ATC AGG 238
Phe Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg
65 70 75

GCG TCA GGG CTG AGA GAC TGC ACT ATG CTG GTC TAT GGT GAC GAC CTG 286
Ala Ser Gly Leu Arg Asp Cys Thr Met Leu Val Tyr Gly Asp Asp Leu
80 85 90 95

GTC GTC ATT GCC GAG AGC GAT GGC GTA GAG GAG GAC AAA CGA GCC CTC 334
Val Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Ala Leu
100 105 110

GGA GTC 340

Gly Val 



(2) INFORMATION FOR SEQ ID NO: 111: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 113 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Thr Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg Ala
65 70 75 80

Ser Gly Leu Arg Asp Cys Thr Met Leu Val Tyr Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Ala Leu Gly
100 105 110 

Val 



(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO



(vii) IMMEDIATE SOURCE: 

(B) CLONE: GB353-3-3 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 2..340

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:
C TCC ACT GTA ACC GAA AAG GAC ATC AGG GTC GAG GAG GAG GTG TAT 46
Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr
1 5 10 15

CAG TGT TGT GAC CTG GAG CCC GAG GCC CGC AAG GCA ATT ACT GCC CTA 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Ala Ile Thr Ala Leu
20 25 30

ACA GAG AGA CTC TAT GTG GGC GGT CCC ATG CAT AAC AGC AAG GGA GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp
35 40 45

CTG TGT GGG TAT CGC AGA TGC CGC GCA AGC GGC GTC TAC ACC ACC AGC 190
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACA CTG ACG TGC TAC CTC AAA GCC TCA GCC GCT ATC AGA 238
Phe Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg
65 70 75

GCG GCG GGG CTG AGA GAC TGC ACC ATG TTG GTC TGT GGT GAT GAC CTG 286
Ala Ala Gly Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTC GTC ATC GCT GAG AGC GAT GGC GTT GAG GAG GAC AAA CGA GCC CTC 334
Val Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Ala Leu
100 105 110 



GGA GCC 340
Gly Ala



(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 113 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Ala Ile Thr Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg Ala
65 70 75 80

Ala Gly Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Ala Leu Gly
100 105 110

Ala 



(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO



(vii) IMMEDIATE SOURCE: 

(B) CLONE: G35-;9-3-o 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 2..340



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

C TCC ACG GTG ACC GAA AGG GAT ATC AGG ACC GAG GAA GAG ATC TAC 46
Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Glu Ile Tyr
1 5 10 15

CAG TGC TGC GAC CTG GAG CCC GAA GCC CGC AAG GTG ATA TCC GCC CTA 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ala Leu
20 25 30

ACG GAA AGA CTC TAC GTG GGC GGT CCC ATG TAC AAC TCC AAG GGG GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp
35 40 45

CTA TGC GGG CAA CGG AGG TGC CGC GCA AGC GGG GTC TAC ACC ACC AGC 190
Leu Cys Gly Gln Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACT GTA ACG TGT TAT CTC AAG GCC GTT GCG GCT ACT AGG 238
Phe Gly Asn Thr Val Thr Cys Tyr Leu Lys Ala Val Ala Ala Thr Arg
65 70 75

GCC GCA GGT CTG AAA GGT TGC AGC ATG CTG GTT TGT GGA GAC GAC TTA 286
Ala Ala Gly Leu Lys Gly Cys Ser Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTC GTC ATC TGC GAG AGC GGC GGC GTA GAG GAG GAT GCA AGA GCC CTC 334
Val Val Ile Cys Glu Ser Gly Gly Val Glu Glu Asp Ala Arg Ala Leu
100 105 110 

CGA GCC 340 
Arg Ala 



(2) INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 113 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Glu Ile Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Gln Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Val Thr Cys Tyr Leu Lys Ala Val Ala Ala Thr Arg Ala
65 70 75 80

Ala Gly Leu Lys Gly Cys Ser Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Cys Glu Ser Gly Gly Val Glu Glu Asp Ala Arg Ala Leu Arg
100 105 110 

Ala 



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO
(vii) IMMEDIATE SOURCE:

(B) CLONE: GB809-3-1

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..340



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

C TCC ACT GTG ACT GAG AGA GAC ATC AAG GTC GAA GAA GAA GTC TAT 46 
Ser Thr Val Thr Glu Arg Asp Ile Lys Val Glu Glu Glu Val Tyr
1 5 10 15

CAG TGT TGT GAT CTG GAG CCC GAG GCC CGC AAG GTA ATA GCC GCC CTC 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ala Ala Leu
20 25 30

ACG GAG AGA CTC TAC GTG GGC GGC CCC ATG CAT AAC AGC AAG GGA GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp
35 40 45

CTT TGC GGG TAT CGT AGA TGC CGC GCG AGC GGC GTA TAC ACC ACC AGC 190
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACA ATG ACG TGC TAC CTT AAG GCC TCA GCA GCC ATC AGG 238
Phe Gly Asn Thr Met Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg
65 70 75

GCT GCG GGG CTA AAG GAT TGC ACC ATG CTG GTT TGC GGT GAC GAC CTA 286
Ala Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTC GTG ATC GCC GAG AGC GGT GGC GTT GAG GAG GAC AAA CGA GCC CTC 334
Val Val Ile Ala Glu Ser Gly Gly Val Glu Glu Asp Lys Arg Ala Leu
100 105 110 

GGA GCT 340 
Gly Ala 



(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

Ser Thr Val Thr Glu Arg Asp Ile Lys Val Glu Glu Glu Val Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ala Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Met Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg Ala
65 70 75 80

Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Gly Gly Val Glu Glu Asp Lys Arg Ala Leu Gly
100 105 110 

Ala 



(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 574 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO



(vii) IMMEDIATE SOURCE: 

(B) CLONE: GB353-4-1

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..574



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

ACT TGC GGC TTT GCC GAC CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCT GTG GGT GGC GTC GCC AGG GCC CTG GCA CAC GGT GTT AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAG GAC GGG ATC AAT TAT GCG ACA GGG AAT CTT CCC GGT TGC TCT TTC 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTC TTG GCA CTT CTT TCG TGC CTG ACT GTT CCC ACC TCG 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

GCC GTC AAC TAT CGC AAT GCC TCG GGC ATC TAT CAC ATC ACC AAT GAC 240
Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

TGC CCG AAC TCG AGC ATA GTG TAC GAG ACC GAG CAC CAC ATC CTA CAC 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Glu His His Ile Leu His
85 90 95

CTC CCA GGG TGT TTA CCC TGC GTG AGG GTT GGG AAT CAG TCA CGC TGC 336
Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys
100 105 110

TGG GTG GCC CTC ACT CCC ACC GTG GCG GCG CCT TAC ATC GGC GCT CCG 384
Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro
115 120 125

CTT GAA TCC CTC CGG AGT CAT GTG GAT CTG ATG GTA GGT GCC GCT ACT 432
Leu Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

GCG TGC TCC GCT CTT TAC ATC GGA GAC CTG TGC GGT GGC GTA TTC TTG 480
Ala Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

GTT GGT CAG ATG TTC TCT TTC CAG CCG CGG CGC CAC TGG ACT ACG CAG 528
Val Gly Gln Met Phe Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAC TGC AAT TGT TCC ATC TAC GCG GGG CAC GTT ACG GGC CAC AGG A 574
Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Val Thr Gly His Arg
180 185 190 



(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Glu His His Ile Leu His
85 90 95

Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys
100 105 110

Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro
115 120 125

Leu Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Ala Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

Val Gly Gln Met Phe Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Val Thr Gly His Arg
180 185 190

(2) INFORMATION FOR SEQ ID NO: 120:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 574 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: GB549-4-3

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..574



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

ACG TGC GGC TTT GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCT GTG GGT GGC GTC GCC AGG GCC TTG GCA CAT GGT GTC AGG GCC GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAG GAC GGG ATT AAC TAT GCA ACA GGG AAT CTT CCC GGT TGC TCC TTT 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTT CTA GCA CTT CTC TCG TGC TTG ACT GTC CCG GCC TCG 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCG CAG CAC TAC CGG AAC ATC TCG GGC ATT TAT CAC GTC ACC AAT GAC 240
Ala Gln His Tyr Arg Asn Ile Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

TGC CCG AAC TCT AGT ATA GTG TAT GAA GCT GAC CAT CAT ATC ATG CAT 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His
85 90 95

CTA CCA GGG TGT GTG CCT TGC GTG AGA ACC GGG AAC ACC TCG CGC TGC 336
Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Thr Ser Arg Cys
100 105 110

TGG GTT CCT TTA ACA CCC ACT GTG GCT GCC CCC TAT GTT GGC GCG CCG 384
Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Val Gly Ala Pro
115 120 125

CTC GAA TCC ATG CGG CGG CAC GTG GAC TTA ATG GTG GGT GCC GCC ACC 432
Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

GTC TGC TCG GCC CTG TAC ATC GGA GAC CTT TGC GGA GGT GTC TTC CTG 480
Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

GTC GGG CAG ATG TTC ACC TTC CGG CCG CGC CGC CAT TGG ACT ACC CAG 528
Val Gly Gln Met Phe Thr Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAC TGC AAC TGC TCT ATC TAT GAT GGC CAC ATC ACC GGC CAT AGA A 574
Asp Cys Asn Cys Ser Ile Tyr Asp Gly His Ile Thr Gly His Arg
180 185 190



(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 




Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Gln His Tyr Arg Asn Ile Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Thr Ser Arg Cys
100 105 110

Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Val Gly Ala Pro
115 120 125

Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Asp Gly His Ile Thr Gly His Arg
180 185 190



(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 574 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO



(vii) IMMEDIATE SOURCE: 

(B) CLONE: GB809-4-3 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..574

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:

ACG TGC GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCC GTT GGG GGC GTC GCC AGG GCC CTG GCG CAT GGC GTC AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAG GAC GGG ATT AAC TAT GCG AOl GGG AAT CTT CCC GGT TGC TCT TTC 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATT TTC CTC CTG GCA CTT CTT TCG TGC CTC ACT GTC CCA GCG TCA 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCT GAG CAC TAC CGG AAT GCT TCG GGC ATC TAT CAC ATC ACC AAT GAC 240
Ala Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

TGT CCG AAT TCC AGC GTA GTC TAT GAA ACT GAC CAC CAT ATA TTG CAC 288
Cys Pro Asn Ser Ser Val Val Tyr Glu Thr Asp His His Ile Leu His
85 90 95

TTG CCG GGG TGC GTA CCC TGC GTG AGG GCC GGG AAC GTG TCT CGT TGC 336
Leu Pro Gly Cys Val Pro Cys Val Arg Ala Gly Asn Val Ser Arg Cys
100 105 110

TGG ACG CCG GTA ACA CCT ACG GTG GCT GCC GTA TCC ATG GAC GCT CCG 384
Trp Thr Pro Val Thr Pro Thr Val Ala Ala Val Ser Met Asp Ala Pro
115 120 125

CTC GAG TCC TTC CGG CGG CAT GTG GAC CTA ATG GTA GGT GCG GCC ACC 432
Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

GTG TGT TCT GTC CTC TAT GTT GGA GAC CTC TGT GGA GGT GCT TTC CTA 480
Val Cys Ser Val Leu Tyr Val Gly Asp Leu Cys Gly Gly Ala Phe Leu
145 150 155 160

GTG GGG CAG ATG TTC ACC TTC CAG CCG CGT CGC CAC TGG ACC ACG CAG 528
Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAT TGT AAT TGC TCC ATC TAT ACT GGC CAT ATC ACC GGC CAC AGG A 574
Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg
180 185 190



(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:




Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Val Val Tyr Glu Thr Asp His His Ile Leu His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Ala Gly Asn Val Ser Arg Cys
100 105 110

Trp Thr Pro Val Thr Pro Thr Val Ala Ala Val Ser Met Asp Ala Pro
115 120 125

Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Val Cys Ser Val Leu Tyr Val Gly Asp Leu Cys Gly Gly Ala Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg
180 185 190

(2) INFORMATION FOR SEQ ID NO: 124: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1..31

(D) OTHER INFORMATION: /standard_name= "HCV Primer HcPr206"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

TGGGGATCCC GTATGATACC CGCTGCTTTG A

(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1..30
(D) OTHER INFORMATION: /standard_name= "HCV Primer HcPr207"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

GGCGGAATTC CTGGTCATAG CCTCCGTGAA

(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB358

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile
1 5 10

(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB549

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:

Gln His Tyr Arg Asn Ile Ser Gly Ile Tyr His Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB809

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile
1 5 10

(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB358

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:

Val Tyr Glu Thr Glu His His Ile Leu His Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB549

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

Val Tyr Glu Ala Asp His His Ile Met His Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB809

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

Val Tyr Glu Thr Asp His His Ile Leu His Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 132:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB358

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:

Val Arg Val Gly Asn Gln Ser Arg Cys Trp Val Ala Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB549

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

Val Arg Thr Gly Asn Thr Ser Arg Cys Trp Val Pro Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB809

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:

Val Arg Ala Gly Asn Val Ser Arg Cys Trp Thr Pro Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB358

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

Ala Pro Tyr Ile Gly Ala Pro Leu Glu Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB549



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 

Ala Pro Tyr Val Gly Ala Pro Leu Glu Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB809

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Ala Val Ser Met Asp Ala Pro Leu Glu Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB358 and GB809

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

Gln Pro Arg Arg His Trp Thr Thr Gln Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB549

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:

Arg Pro Arg Arg His Trp Thr Thr Gln Asp
1 5 10



(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB549

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

Arg Pro Arg Arg His Trp Thr Thr Gln Asp
1 5 10



(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: 
TGGGATATGA TGATGAACTG GTC 
(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 
CCAGGTACAA CCGAACCAAT TGCC 
(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 957 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..957

(ix) FEATURE:

(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..954

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:

ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA AAC ACT AAC 48
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGC CAG ATC GTT GGT 96
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

GGA GTA TAC TTG TTG CCG CGC AGG GGC CCC CGG TTG GGT GTG CGC GCG 144
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

ACG AGG AAA ACT TCC GAG CGG TCC CAG CCA CGT GGG AGG CGC CAG CCC 192
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

ATC CCC AAA GAT CGG CGC CCC ACT GGC AAG TCC TGG GGA AAA CCA GGA 240
Ile Pro Lys Asp Arg Arg Pro Thr Gly Lys Ser Trp Gly Lys Pro Gly
65 70 75 80

TAC CCT TGG CCC CTG TAC GGG AAT GAG GGC CTC GGC TGG GCA GGG TGG 288
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95



CTC CTG TCC CCC CGA GGG TCT CGC CCG TCA TGG GGC CCA ACT GAC CCC 336 
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro 
100 105 110 

CGG CAC AGG TCA CGC AAC TTG GGT AAG GTC ATC GAT ACC CTT ACG TGT 384
Arg His Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

GGC TTT GCC GAC CTC ATG GGG TAC ATC CCT GTC GTC GGC GCC CCA GTT 432
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Val
130 135 140

GGT GGT GTC GCC AGA GCT CTC GCG CAT GGC GTG AGA GTT CTG GAA GAC 480
Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

GGG ATA AAC TAT GCA ACA GGG AAC TTG CCC GGT TGC TCC TTT TCT ATC 528
Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

TTC TTA TTG GCC CTG CTA TCT TGT ATC ACT GTG CCG GTC TCC GGC TTG 576
Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Val Pro Val Ser Gly Leu
180 185 190

CAG GTC AAG AAC ACC AGC AGC TCT TAC ATG GTA ACC AAT GAC TGC CAG 624 
Gln Val Lys Asn Thr Ser Ser Ser Tyr Met Val Thr Asn Asp Cys Gln
195 200 205

AAC AGT AGC ATC GTC TGG CAG CTC AGG GAT GCT GTT CTT CAC GTC CCC 672
Asn Ser Ser Ile Val Trp Gln Leu Arg Asp Ala Val Leu His Val Pro
210 215 220

GGG TGT GTC CCT TGT GAG GAG AAG GGC AAC ATA TCC CGC TGT TGG ATA 720
Gly Cys Val Pro Cys Glu Glu Lys Gly Asn Ile Ser Arg Cys Trp Ile
225 230 235 240

CCG GTT TCG CCC AAT ATA GCT GTG AGC CAA CCT GGT GCG CTT ACC AAG 768
Pro Val Ser Pro Asn Ile Ala Val Ser Gln Pro Gly Ala Leu Thr Lys
245 250 255

GGC CTG CGG ACG CAT ATT GAT ACC ATC ATT GCA TCC GCT ACG TTT TGC 816
Gly Leu Arg Thr His Ile Asp Thr Ile Ile Ala Ser Ala Thr Phe Cys
260 265 270

TCT GCC CTG TAC ATA GGA GAC CTG TGT GGC GC3 GTG ATG TTG GCT TCT 864
Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Ala Val Met Leu Ala Ser
275 280 285

CAA GTC TTC ATC ATC TCG CCC CAG CAT CAT AAG TTT GTC CAG GAC TGC 912
Gln Val Phe Ile Ile Ser Pro Gln His His Lys Phe Val Gln Asp Cys
290 295 300

AAC TGT TCC ATA TAC CCA GGC CAC ATC ACT GGA CAT CGG ATG GCG 957
Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala
305 310 315



(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 319 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Asp Arg Arg Pro Thr Gly Lys Ser Trp Gly Lys Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Val
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Val Pro Val Ser Gly Leu
180 185 190

Gln Val Lys Asn Thr Ser Ser Ser Tyr Met Val Thr Asn Asp Cys Gln
195 200 205

Asn Ser Ser Ile Val Trp Gln Leu Arg Asp Ala Val Leu His Val Pro
210 215 220

Gly Cys Val Pro Cys Glu Glu Lys Gly Asn Ile Ser Arg Cys Trp Ile
225 230 235 240

Pro Val Ser Pro Asn Ile Ala Val Ser Gln Pro Gly Ala Leu Thr Lys
245 250 255

Gly Leu Arg Thr His Ile Asp Thr Ile Ile Ala Ser Ala Thr Phe Cys
260 265 270

Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Ala Val Met Leu Ala Ser
275 280 285

Gln Val Phe Ile Ile Ser Pro Gln His His Lys Phe Val Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala
305 310 315

(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 2..337

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..340

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:

C TCA ACG GTC ACG GAG AGG GAC ATC AGA ACT GAG GAG TCC ATA TAC 46
Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr
1 5 10 15

CTT GCT TGC TCT TTA CCC GAG CAG GCA CGG ACT GCC ATA CAC TCA CTG 94
Leu Ala Cys Ser Leu Pro Glu Gln Ala Arg Thr Ala Ile His Ser Leu
20 25 30

ACT GAG AGG CTT TAC GTG GGA GGG CCC ATG CTA AAC AGC AAA GGG CAA 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met Leu Asn Ser Lys Gly Gln
35 40 45

ACC TGC GGA TAC AGA CGC TGC CGC GCC AGC GGA GTG TTC ACC ACT AGC 190
Thr Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser
50 55 60

ATG GGA AAT ACC ATC ACG TGC TAC GTG AAG GCA CAA GCA GCC TGT AAG 238
Met Gly Asn Thr Ile Thr Cys Tyr Val Lys Ala Gln Ala Ala Cys Lys
65 70 75

GCT GCG GGC ATA ATT GCC CCC ACG ATG CTG GTG TGC GGC GAC GAT CTA 286
Ala Ala Gly Ile Ile Ala Pro Thr Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTT GTC ATC TCA GAG AGT CAG GGG ACC GAG GAG GAC GAG CGG AAC CTA 334
Val Val Ile Ser Glu Ser Gln Gly Thr Glu Glu Asp Glu Arg Asn Leu
100 105 110

CGA GCC 340

Arg Ala 



(2) INFORMATION FOR SEQ ID NO: 146: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 113 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:

Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Leu
1 5 10 15

Ala Cys Ser Leu Pro Glu Gln Ala Arg Thr Ala Ile His Ser Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met Leu Asn Ser Lys Gly Gln Thr
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Met
50 55 60

Gly Asn Thr Ile Thr Cys Tyr Val Lys Ala Gln Ala Ala Cys Lys Ala
65 70 75 80

Ala Gly Ile Ile Ala Pro Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ser Glu Ser Gln Gly Thr Glu Glu Asp Glu Arg Asn Leu Arg
100 105 110 



Ala 



(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 345 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..345

(ix) FEATURE:

(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..342



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 
ATG AGC ACA CTT CCT AAA CCA CAA AGA AAA ACC AAA AGA AAC ACC AAC 48
Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

CCC GGC CAC AGG ACG TTA AGT TCC CAG GCG GCG GTC AGA TCG TTG GTG 96
Pro Gly His Arg Thr Leu Ser Ser Gln Ala Ala Val Arg Ser Leu Val
20 25 30

GAG TTT ACG TGC TAC CAC GCA GGG GCC CCC AGT TGG GTG TGC GTG CAG 144
Glu Phe Thr Cys Tyr His Ala Gly Ala Pro Ser Trp Val Cys Val Gln
35 40 45

TGC GCA AGA CTT CCG AGC GGT CGC AAC CTC GCA GTA GGC GCC AAC CCA 192
Cys Ala Arg Leu Pro Ser Gly Arg Asn Leu Ala Val Gly Ala Asn Pro
50 55 60

TCC CCA GGG CGC GCC GAA CCG AGG GCA GGT CCT GGG CTC AGC CCG GGT 240
Ser Pro Gly Arg Ala Glu Pro Arg Ala Gly Pro Gly Leu Ser Pro Gly
65 70 75 80

ACC CTT GGC CCC TAT ATG GGA ATG AGG GCT GCG GGT GGG CAG GGT GGC 288
Thr Leu Gly Pro Tyr Met Gly Met Arg Ala Ala Gly Gly Gln Gly Gly
85 90 95

TCC TGT CCC CGC GCG GCT CTC GCC CGT CGT GGG GCC CAA ATG ACC CCC 336
Ser Cys Pro Arg Ala Ala Leu Ala Arg Arg Gly Ala Gln Met Thr Pro
100 105 110

GGC GCA GGA 345

Gly Ala Gly 
115 



(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 115 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:

Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Pro Gly His Arg Thr Leu Ser Ser Gln Ala Ala Val Arg Ser Leu Val
20 25 30

Glu Phe Thr Cys Tyr His Ala Gly Ala Pro Ser Trp Val Cys Val Gln
35 40 45

Cys Ala Arg Leu Pro Ser Gly Arg Asn Leu Ala Val Gly Ala Asn Pro
50 55 60

Ser Pro Gly Arg Ala Glu Pro Arg Ala Gly Pro Gly Leu Ser Pro Gly
65 70 75 80

Thr Leu Gly Pro Tyr Met Gly Met Arg Ala Ala Gly Gly Gln Gly Gly
85 90 95

Ser Cys Pro Arg Ala Ala Leu Ala Arg Arg Gly Ala Gln Met Thr Pro
100 105 110 

Gly Ala Gly 
115 

(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 280 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..280

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 2..277



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:

G GCC TGT GAC CTC AAG GAC GAG GCT AGG AGG GTG ATA ACT TCA CTC 46
Ala Cys Asp Leu Lys Asp Glu Ala Arg Arg Val Ile Thr Ser Leu
1 5 10 15

ACG GAG CGG CTT TAC TGT GGT GGT CCT ATG TTC AAC AGC AAG GGA CAA 94
Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly Gln
20 25 30

CAC TGC GGT TAC CGC CGC TGC CGT GCT AGT GGG GTG CTA CCC ACC AGC 142
His Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr Ser
35 40 45

TTC GGG AAC ACA ATC ACC TGT TAC ATC AAA GCA AAG GCA GCT ACC AAA 190
Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Lys Ala Ala Thr Lys
50 55 60

GCT GCC GGA ATT AAA AAT CCA TCA TTC CTT GTC TGC GGA GAT GAC TTG 238
Ala Ala Gly Ile Lys Asn Pro Ser Phe Leu Val Cys Gly Asp Asp Leu
65 70 75

GTC GTG ATT GCT GAG AGT GCA GGG ATC GAT GAG GAC AGA GCG 280
Val Val Ile Ala Glu Ser Ala Gly Ile Asp Glu Asp Arg Ala
80 85 90

(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

Ala Cys Asp Leu Lys Asp Glu Ala Arg Arg Val Ile Thr Ser Leu Thr
1 5 10 15

Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly Gln His
20 25 30

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr Ser Phe
35 40 45

Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Lys Ala Ala Thr Lys Ala
50 55 60

Ala Gly Ile Lys Asn Pro Ser Phe Leu Val Cys Gly Asp Asp Leu Val
65 70 75 80

Val Ile Ala Glu Ser Ala Gly Ile Asp Glu Asp Arg Ala
85 90

(2) INFORMATION FOR SEQ ID NO: 151:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 499 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..499

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..496



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA AAC ACC AAC 48
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

CGT CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG ATC GTT GGC 96
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGG ATG GGT GTG CGC GCG 144
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala
35 40 45

ACT CGG AAG ACT TCG GAA CGG TCG CAA CCC CGT GGA CGG CGT CAG CCT 192
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

ATT CCC AAG GCG CGC CAG CCC ACG GGC CGG TCC TGG GGT CAA CCC GGG 240
Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly
65 70 75 80

TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC GGG TGG GCA GGG TGG 288
Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

CTG CTC TCC CCT CGA GGC TCT CGG CCT AAT TGG GGC CCC AAT GAC CCC 336
Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
100 105 110

CGG CGA AAA TCG CGT AAT TTG GGT AAG GTC AT" GAT ACC CTA ACG TGC 384
Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

GGA TTC GCC GAT CTC ATG GGG TAT ATC CCG CTC GTA GGC GGC CCC ATT 432
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile
130 135 140

GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC CTT GAG GAC 480
Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

GGG GTA AAC TAT GCA ACA G 499

Gly Val Asn Tyr Ala Thr 
165 



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 166 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Ala Thr
165

(2) INFORMATION FOR SEQ ID NO: 153:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 579 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..579

(ix) FEATURE:

(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..576



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

ACG TGC GGA TTC GCC GAT CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly
1 5 10 15

CCC GTT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC CTT 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu
20 25 30

GAG GAC GGG GTA AAC TAT CCA ACA GGG AAT TTA CCC GGT TGC TCT TTC 144
Glu Asp Gly Val Asn Tyr Pro Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTT ATT CTT GCT CTT CTC TCG TGT CTG ACC GTT CCG GCC TCT 192
Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCA GTT CCC TAC CGA AAT GCC TCT GGG ATT TAT CAT GTT ACC AAT GAT 240
Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

TGC CCA AAC TCT TCC ATA GTC TAT GAG GCA GAT AAC CTG ATC CTA CAC 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His
85 90 95

GCA CCT GGT TGC GTG CCT TGT GTC ATG ACA GGT AAT GTG AGT AGA TGC 336
Ala Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys
100 105 110

TGG GTC CAA ATT ACC CCT ACA CTG TCA GCC CCG AGC CTC GGA GCA GTC 384
Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val
115 120 125

ACG GCT CCT CTT CGG AGA GCC GTT GAC TAC CTA GCG GGA GGG GCT GCC 432
Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala
130 135 140

CTC TGC TCC GCG TTA TAC GTA GGA GAC GCG TGT GGG GCA CTA TTC TTG 480
Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu
145 150 155 160

GTA GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAC GCT ACG GTG CAG 528
Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln
165 170 175

AAC TGC AAC TGT TCC ATT TAC AGT GGC CAT GTT ACC GGC CAC CGG ATG 576
Asn Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met
180 185 190

GCG 579
Ala 



(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu
20 25 30

Glu Asp Gly Val Asn Tyr Pro Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His
85 90 95

Ala Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys
100 105 110

Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val
115 120 125

Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala
130 135 140

Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln
165 170 175

Asn Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met
180 185 190

Ala 



(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 579 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..579

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..576

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

ACG TGC GGA TTC GCC GAC CTC GTG GGG TAC ATC CCG CTC GTA GGC GGC 48
Thr Cys Gly Phe Ala Asp Leu Val Gly Tyr Ile Pro Leu Val Gly Gly
1 5 10 15

CCC GTT GGG GGC GTC GCA AGG GCT CTC GCA CAT GGT GTG AGG GTT CTT 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu
20 25 30

GAG GAC GGG GTG AAT TAT GCA ACA GGG AAT CTG CCT GGT TGC TCT TTC 144
Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC ATT CTT GCA CTT CTC TCG TGC CTC ACT GTC CCG GCC TCT 192
Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCA GTT CCC TAC CGA AAT GCC TCT GGG ATC TAT CAT GTC ACC AAT GAT 240
Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

TGC CCA AAC TCT TCC ATA GTC TAT GAG GC\ GAT GAT CTG ATC CTA CAC 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu His
85 90 95

GCA CCT GGC TGC GTG CCT TGT GTC AGG AAA GAT AAT GTG AGT AGG TGC 336
Ala Pro Gly Cys Val Pro Cys Val Arg Lys Asp Asn Val Ser Arg Cys
100 105 110

TGG GTC CAA ATT ACC CCC ACG CTG TCA GCC CCG AGC TTC GGA GCA GTC 384
Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Phe Gly Ala Val
115 120 125

ACG GCT CCC CTT CGG AGA GCC GTT GAT TAC TTG GTG GGA GGG GCT GCC 432
Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Val Gly Gly Ala Ala
130 135 140

CTC TGC TCC GCG TTA TAC GTT GGA GAC GCG TGT GGG GCA CTA TTT TTG 480
Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu
145 150 155 160

GTA GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAT GCT ACG GTG CAG 528
Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln
165 170 175

GAC TGC AAC TGT TCC ATC TAC AGT GGC CAC GTC ACC GGC CAT CAG ATG 576
Asp Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Gln Met
180 185 190

GCA 579
Ala



(2) INFORMATION FOR SEQ ID NO: 156:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 193 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:

Thr Cys Gly Phe Ala Asp Leu Val Gly Tyr Ile Pro Leu Val Gly Gly
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu
20 25 30

Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu His
85 90 95

Ala Pro Gly Cys Val Pro Cys Val Arg Lys Asp Asn Val Ser Arg Cys
100 105 110

Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Phe Gly Ala Val
115 120 125

Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Val Gly Gly Ala Ala
130 135 140

Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Gln Met
180 185 190

Ala 



(2) INFORMATION FOR SEQ ID NO: 157:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 530 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3..530

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 3..527



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 

CA CCT ACS ACA GCT CTG CTG GTG GCC CAG TTA CTG CGG ATT CCC CAA 47
Pro Thr Thr Ala Leu Leu Val Ala Gln Leu Leu Arg Ile Pro Gln
1 5 10 15

GTG GTC ATT GAC ATC ATC GCA GGG AGC CAC TGG GGG GTC TTG TTT GCC 95
Val Val Ile Asp Ile Ile Ala Gly Ser His Trp Gly Val Leu Phe Ala
20 25 30

GCC GCA TAC TAT GCA TCG GTG GCT AAC TGG ACC AAG GTC GTG CTG GTC 143
Ala Ala Tyr Tyr Ala Ser Val Ala Asn Trp Thr Lys Val Val Leu Val
35 40 45

TTG TTT CTG TTT GCA GGG GTT GAT GCT ACT ACC CAG ATT TCG GGC GGC 191
Leu Phe Leu Phe Ala Gly Val Asp Ala Thr Thr Gln Ile Ser Gly Gly
50 55 60

TCC AGC GCC CAA ACG ACG TAT GGC ATC GCC TCA TTT ATC ACC CGC GGC 239
Ser Ser Ala Gln Thr Thr Tyr Gly Ile Ala Ser Phe Ile Thr Arg Gly
65 70 75

GCG CAG CAG AAA CTG CAG CTC ATA AAT ACC AAC GGA AGC TGG CAC ATC 287
Ala Gln Gln Lys Leu Gln Leu Ile Asn Thr Asn Gly Ser Trp His Ile
80 85 90 95

AAC AGG ACC GCC CTT AAT TGT AAT GAC AGC CTC CAG ACT GGG TTC ATA 335
Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe Ile
100 105 110

GCC GGC CTC TTC TAC TAC CAT AAG TTC AAC TCT TCT GGA TGC CCG GAT 383
Ala Gly Leu Phe Tyr Tyr His Lys Phe Asn Ser Ser Gly Cys Pro Asp
115 120 125

CGG ATG GCT AGC TGT AGG GCC CTT GCC ACT TTT GAC CAG GGC TGG GGA 431
Arg Met Ala Ser Cys Arg Ala Leu Ala Thr Phe Asp Gln Gly Trp Gly
130 135 140

ACT ATC AGC TAT GCC AAC ATA TCG GGT CCC AGT GAT GAC AAA CCA TAT 479
Thr Ile Ser Tyr Ala Asn Ile Ser Gly Pro Ser Asp Asp Lys Pro Tyr
145 150 155

TGC TGG CAC TAT CCC CCA CGG CCG TGC GGA GTG GTG CCA GCC CAA GAG 527
Cys Trp His Tyr Pro Pro Arg Pro Cys Gly Val Val Pro Ala Gln Glu
160 165 170

GTC 530
Val



(2) INFORMATION FOR SEQ ID NO: 158:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 176 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:

Pro Thr Thr Ala Leu Leu Val Ala Gln Leu Leu Arg Ile Pro Gln Val
1 5 10 15

Val Ile Asp Ile Ile Ala Gly Ser His Trp Gly Val Leu Phe Ala Ala
20 25 30

Ala Tyr Tyr Ala Ser Val Ala Asn Trp Thr Lys Val Val Leu Val Leu
35 40 45

Phe Leu Phe Ala Gly Val Asp Ala Thr Thr Gln Ile Ser Gly Gly Ser
50 55 60

Ser Ala Gln Thr Thr Tyr Gly Ile Ala Ser Phe Ile Thr Arg Gly Ala
65 70 75 80

Gln Gln Lys Leu Gln Leu Ile Asn Thr Asn Gly Ser Trp His Ile Asn
85 90 95

Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe Ile Ala
100 105 110

Gly Leu Phe Tyr Tyr His Lys Phe Asn Ser Ser Gly Cys Pro Asp Arg
115 120 125

Met Ala Ser Cys Arg Ala Leu Ala Thr Phe Asp Gln Gly Trp Gly Thr
130 135 140

Ile Ser Tyr Ala Asn Ile Ser Gly Pro Ser Asp Asp Lys Pro Tyr Cys
145 150 155 160

Trp His Tyr Pro Pro Arg Pro Cys Gly Val Val Pro Ala Gln Glu Val
165 170 175



(2) INFORMATION FOR SEQ ID NO: 159:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..340

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 2..337

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:



C TCG ACC GTT ACC GAA CAT GAC ATA ATG ACC GAA GAG TCC ATT TAC 46
Ser Thr Val Thr Glu His Asp Ile Met Thr Glu Glu Ser Ile Tyr
1 5 10 15

CAA TCA TGT GAC TTG CAG CCC GAG GCA CGC GCA GCA ATA CGG TCA CTC 94
Gln Ser Cys Asp Leu Gln Pro Glu Ala Arg Ala Ala Ile Arg Ser Leu
20 25 30

ACC CAA CGC CTC TAC TGT GGA GGC CCC ATG TAC AAC AGC AAG GGG CAA 142
Thr Gln Arg Leu Tyr Cys Gly Gly Pro Met Tyr Asn Ser Lys Gly Gln
35 40 45

CAG TGT GGT TAT CGC AGA TGC CGC GCC AGC GGC GTT TTC ACC ACC AGT 190
Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser
50 55 60

ATG GGC AAC ACC ATG ACG TGC TAC ATC AAG GCT TTA GCC TCC TGT AGA 238
Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ser Cys Arg
65 70 75

GCC GCA AGG CTC CGG GAC TGC ACG CTC CTG GTG TGT GGT GAC GAT CTT 286
Ala Ala Arg Leu Arg Asp Cys Thr Leu Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTG GCC ATC TGC GAG AGC CAG GGG ACA CAC GAG GAT GAA GCA AGC CTG 334
Val Ala Ile Cys Glu Ser Gln Gly Thr His Glu Asp Glu Ala Ser Leu
100 105 110

AGA GCC 340
Arg Ala



(2) INFORMATION FOR SEQ ID NO: 160:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:

Ser Thr Val Thr Glu His Asp Ile Met Thr Glu Glu Ser Ile Tyr Gln
1 5 10 15

Ser Cys Asp Leu Gln Pro Glu Ala Arg Ala Ala Ile Arg Ser Leu Thr
20 25 30

Gln Arg Leu Tyr Cys Gly Gly Pro Met Tyr Asn Ser Lys Gly Gln Gln
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Met
50 55 60

Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ser Cys Arg Ala
65 70 75 80

Ala Arg Leu Arg Asp Cys Thr Leu Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Ala Ile Cys Glu Ser Gln Gly Thr His Glu Asp Glu Ala Ser Leu Arg
100 105 110

Ala



(2) INFORMATION FOR SEQ ID NO: 161:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..340

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 2..337

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:

C TCA ACC GCC ACC GAA CAT GAC ATA TTG ACT GAA GAG TCC ATA TAC 46
Ser Thr Ala Thr Glu His Asp Ile Leu Thr Glu Glu Ser Ile Tyr
1 5 10 15

CAA TCA TGT GAC TCG CAG CCC GAC GCA CGC GCA GCA ATA CGG TCA CTC 94
Gln Ser Cys Asp Ser Gln Pro Asp Ala Arg Ala Ala Ile Arg Ser Leu
20 25 30

ACC CAA CGC TTG TTC TGT GGA GGC CCC ATG TAT AAC AGC AAG GGG CAA 142
Thr Gln Arg Leu Phe Cys Gly Gly Pro Met Tyr Asn Ser Lys Gly Gln
35 40 45

CAA TGT GGT TAT CGC AGA TGC CGC GCC AGC GGC GTC TTC ACC ACC AGT 190
Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser
50 55 60

ATG GGC AAC ACC ATG ACG TGC TAC ATT AAG GCT TTA GCC TCC TGT AGA 238
Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ser Cys Arg
65 70 75

ACC GCT GGG CTC CGG GAC TAC ACG CTC CTG GTG TGT GGT GAC GAT CAT 286
Thr Ala Gly Leu Arg Asp Tyr Thr Leu Leu Val Cys Gly Asp Asp His
80 85 90 95

GTG GCC ATC TGC GAG AGC CAG GGG ACA CAC GAG GAT GAA GCG AAC CTG 334
Val Ala Ile Cys Glu Ser Gln Gly Thr His Glu Asp Glu Ala Asn Leu
100 105 110

AGA GCC 340
Arg Ala



(2) INFORMATION FOR SEQ ID NO: 162:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:

Ser Thr Ala Thr Glu His Asp Ile Leu Thr Glu Glu Ser Ile Tyr Gln
1 5 10 15

Ser Cys Asp Ser Gln Pro Asp Ala Arg Ala Ala Ile Arg Ser Leu Thr
20 25 30

Gln Arg Leu Phe Cys Gly Gly Pro Met Tyr Asn Ser Lys Gly Gln Gln
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Met
50 55 60

Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ser Cys Arg Thr
65 70 75 80

Ala Gly Leu Arg Asp Tyr Thr Leu Leu Val Cys Gly Asp Asp His Val
85 90 95

Ala Ile Cys Glu Ser Gln Gly Thr His Glu Asp Glu Ala Asn Leu Arg
100 105 110

Ala



(2) INFORMATION FOR SEQ ID NO: 163:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 499 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..499

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..496

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:

ATG AGC ACG AAT CCT AAA CTT CAA AGA AAA ACC AAA CGT AAC ACC AAC 48
Met Ser Thr Asn Pro Lys Leu Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

CGC CGC CCC ATG GAC GTT AAG TTC CCG GGT GGT GGC CAG ATC GTT GGC 96
Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGG TTG GGT GTG CGC GCG 144
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

ACT CGG AAG ACT TCG GAG CGG TCG CAA CCT CGT GGG AGG CGC CAA CCT 192
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

ATC CCC AAG GCG CGC CGA TCC GAG GGC AGA TCC TGG GCG CAG CCC GGG 240
Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly
65 70 75 80

TAT CCT TGG CCC CTT TAC GGC AAT GAG GGC TGT GGG TGG GCA GGG TGG 288
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
85 90 95

CTC CTG TCC CCT CGC GGG TCT CGG CCG TCT TGG GGC CCT AAT GAT CCC 336
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

CGG CGG AGG TCC CGC AAC CTG GGT AAG GTC ATC GAT ACC CTA ACA TGC 384
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTT GTA GGC GCC CCC GTG 432
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Val
130 135 140

GGT GGC GTC GCC AGA GCC CTG GCA CAC GGT GTT AGG GCT GTG GAA GAC 480
Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val Glu Asp
145 150 155 160

GGG ATC AAC TAC GCA ACA G 499
Gly Ile Asn Tyr Ala Thr
165



(2) INFORMATION FOR SEQ ID NO: 164:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 166 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:

Met Ser Thr Asn Pro Lys Leu Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Val
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val Glu Asp
145 150 155 160

Gly Ile Asn Tyr Ala Thr
165

(2) INFORMATION FOR SEQ ID NO: 165:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 499 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA ACACCAACCG CCGCCCTATG 60

GACGTTAAGT TCCCAGGCGG TGGTCAGATC GTTGGCGGAG TTTACTTGTT GCCGCGCAGG 120

GGCCCCAGGT TGGGTGTGCG CGCGACTCGG AAGACTTCGG AGCGGTCGCA ACCTCGTGGG 180

AGGCGCCAAC CTATCCCCAA GGCGCGCCGA ACCGAGGGCA GATCCTGGGC GCAGCCCGGG 240

TATCCTTGGC CCCTTTACGG CAATGAGGGC TGTGGGTGGG CAGGGTGGCT CCTGTCCCCT 300

CGCGGNTCTC GGNCGTCTTG GGGCCCCAAT GATCCCCGGN GGAGATCCCG CAACTTGGGT 360

AAGGTCATCG ATACCCTAAC ATGCGGCTTC GCCGACCTCA TGGGATACAT CCCGCTTGTA 420

GGCGCCCCCG TGGGTGGCGT CGCCAGGGCC CTGGCACATG GTGTTAGGGC TGTGGAAGAC 480

GGGATCAATT ATGCAACAG 499

(2) INFORMATION FOR SEQ ID NO: 166:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 126 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Thr Glu Gly Arg Ser Trp Ala Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Xaa Ser Arg Xaa Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Xaa Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

(2) INFORMATION FOR SEQ ID NO: 167:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 579 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..579

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..579

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:

ACA TGC GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTT GTA GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCC GTG GGT GGC GTC GCC AGG GCC CTG GCA CAT GGT GTT AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAA GAC GGG ATC AAT TAT GCA ACA GGG AAC CTT CCC GGT TGC TCC TTT 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTC TTG GCG CTC CTC TCG TGC CTG ACT GTT CCC ACA TCG 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

GCC GTT AAC TAT CGC AAT GCT TCG GGC ATT TAT CAC ATC ACC AAT GAC 240
Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

TGC CCG AAT GCA AGC ATA GTG TAC GAG ACC GAA AAT CAC ATC TTA CAC 288
Cys Pro Asn Ala Ser Ile Val Tyr Glu Thr Glu Asn His Ile Leu His
85 90 95

CTC CCA GGG TGC GTA CCC TGT GTG AGG ACT GGG AAC CAG TCG CGG TGT 336
Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys
100 105 110

TGG GTG GCC CTC ACT CCC ACA GTA GCG TCG CCA TAC GCC GGT GCT CCG 384
Trp Val Ala Leu Thr Pro Thr Val Ala Ser Pro Tyr Ala Gly Ala Pro
115 120 125

CTT GAG CCC TTG CGG CGT CAT GTG GAC CTG ATG GTA GGT GCT GCC ACC 432
Leu Glu Pro Leu Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

ATG TGT TCC GCC CTC TAC ATC GGC GAC TTG TGC GGT GGC TTA TTC TTG 480
Met Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Leu Phe Leu
145 150 155 160

GTG GGC CAA ATG TTC ACC TTC CAA CCG CGA CGT CAC TGG ACC ACT CAG 528
Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAC TGC AAT TGT TCC ATC TAC ACG GGC CAC ATT ACG GGT CAT CGG ATG 576
Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg Met
180 185 190

GCA 579
Ala



(2) INFORMATION FOR SEQ ID NO: 168:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 193 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15



SUBSTITUTE SHEET (RULE 26)

WO 94/25601 PCT/EP94/01323

217 

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

Cys Pro Asn Ala Ser Ile Val Tyr Glu Thr Glu Asn His Ile Leu His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys
100 105 110

Trp Val Ala Leu Thr Pro Thr Val Ala Ser Pro Tyr Ala Gly Ala Pro
115 120 125

Leu Glu Pro Leu Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Met Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Leu Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg Met
180 185 190

Ala 



(2) INFORMATION FOR SEQ ID NO: 169:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 579 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..579

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..576

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

ACA TGC GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTT GTA GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCC GTG GGT GGC GTC GCC AGA GCC CTG GCA CAC GGT GTT AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAA GAC GGG ATC AAC TAC GCA ACA GGG AAT CTC CCC GGT TGC TCC TTT 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTC TTG GCA CTT CTC TCG TGC CTC ACT GTT CCC GCG TCG 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GGC GTT AAC TAT CGC AAT GCT TCG GGC GTT TAT CAC ATC ACC AAC GAC 240
Gly Val Asn Tyr Arg Asn Ala Ser Gly Val Tyr His Ile Thr Asn Asp
65 70 75 80

TGC CCG AAT GCG AGC ATA GTG TAC GAG ACC GAC AAT CAC ATC TTA CAC 288
Cys Pro Asn Ala Ser Ile Val Tyr Glu Thr Asp Asn His Ile Leu His
85 90 95

CTC CCA GGG TGC GTA CCC TGT GTG AAG ACC GGG AAC CAG TCG CGG TGT 336
Leu Pro Gly Cys Val Pro Cys Val Lys Thr Gly Asn Gln Ser Arg Cys
100 105 110

TGG GTG GCC CTC ACT CCC ACA GTG GCG TCG CCT TAC GTC GGT GCT CCG 384
Trp Val Ala Leu Thr Pro Thr Val Ala Ser Pro Tyr Val Gly Ala Pro
115 120 125

CTC GAG CCC TTG CGG CGC CAT GTG GAC CTG ATG GTA GGT GCT GCC ACC 432
Leu Glu Pro Leu Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

GTG TGC TCC GCC CTC TAC GTC GGC GAC CTG TGC GGT GGC TTA TTC TTG 480
Val Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Leu Phe Leu
145 150 155 160

GTA GGC CAA ATG TTC ACC TTC CAA CCG CGA CGC CAC TGG ACG ACC CAG 528
Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAC TGT AAT TGT TCC ATC TAC GCA GGG CAT ATT ACG GGC CAT CGG ATG 576
Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Arg Met
180 185 190

GCT 579
Ala



(2) INFORMATION FOR SEQ ID NO: 170:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 193 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Gly Val Asn Tyr Arg Asn Ala Ser Gly Val Tyr His Ile Thr Asn Asp
65 70 75 80

Cys Pro Asn Ala Ser Ile Val Tyr Glu Thr Asp Asn His Ile Leu His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Lys Thr Gly Asn Gln Ser Arg Cys
100 105 110

Trp Val Ala Leu Thr Pro Thr Val Ala Ser Pro Tyr Val Gly Ala Pro
115 120 125

Leu Glu Pro Leu Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Val Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Leu Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Arg Met
180 185 190

Ala 



(2) INFORMATION FOR SEQ ID NO: 171:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 579 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..579

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..576

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:

ACA TGC GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTT GTG GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCT GTT GGT GGC GTC GCC AGA GCC CTT GCG CAC GGC GTC AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAA GAC GGG ATT AAC TAT GCA ACA GGG AAC CTT CCT GGT TGC TCC TTT 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTT CTG GCA CTT CTC TCG TGC CTG ACT GTC CCC GCC TCG 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCT GTG CAT TAT CAC AAC ACC TCG GGC ATC TAC CAC CTC ACC AAT GAC 240
Ala Val His Tyr His Asn Thr Ser Gly Ile Tyr His Leu Thr Asn Asp
65 70 75 80

TGC CCT AAC TCT AGC ATA GTC TTT GAG GCA GTC CAT CAC ATC TTG CAC 288
Cys Pro Asn Ser Ser Ile Val Phe Glu Ala Val His His Ile Leu His
85 90 95

CTT CCA GGA TGC GTC CCT TGT GTA AGA ACT GGG AAC CAG TCT CGG TGC 336
Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys
100 105 110

TGG GTA GCC TTG ACC CCC ACG CTG GCC GCG CCA TAC CTT GGC GCT CCA 384
Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Pro Tyr Leu Gly Ala Pro
115 120 125

CTC GAG TCC ATG CGG CGT CAC GTG GAT TTG ATG GTG GGC ACT GCT ACA 432
Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Thr Ala Thr
130 135 140

TTG TGC TCA GCA CTC TAC GTT GGG GAC CTG TGC GGG GGC ATA TTC CTA 480
Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Ile Phe Leu
145 150 155 160

GCG GGC CAG ATG TTC ACC TTC CGG CCC CGC CTC CAT TGG ACC ACC CAG 528
Ala Gly Gln Met Phe Thr Phe Arg Pro Arg Leu His Trp Thr Thr Gln
165 170 175

GAG TGC AAT TGT TCC ACC TAT CCG GGC CAC ATC ACG GGT CAT AGA ATG 576
Glu Cys Asn Cys Ser Thr Tyr Pro Gly His Ile Thr Gly His Arg Met
180 185 190

GCG 579
Ala



(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Val His Tyr His Asn Thr Ser Gly Ile Tyr His Leu Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Phe Glu Ala Val His His Ile Leu His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys
100 105 110

Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Pro Tyr Leu Gly Ala Pro
115 120 125

Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Thr Ala Thr
130 135 140

Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Ile Phe Leu
145 150 155 160

Ala Gly Gln Met Phe Thr Phe Arg Pro Arg Leu His Trp Thr Thr Gln
165 170 175

Glu Cys Asn Cys Ser Thr Tyr Pro Gly His Ile Thr Gly His Arg Met
180 185 190

Ala 
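
The correspondence between a cDNA listing and its protein listing can be checked mechanically. The following is a minimal sketch, not part of the original listing, which assumes the standard genetic code and a coding region that begins at the first position given under LOCATION. The names `translate`, `CODON_TABLE` and `THREE_LETTER` are illustrative only.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names as used in the listing ("Ter" = stop).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(cdna, start=1):
    """Translate a cDNA string, reading codons from 1-based position `start`.

    Whitespace is ignored; a trailing partial codon is dropped.
    """
    seq = "".join(cdna.split()).upper()[start - 1:]
    usable = len(seq) - len(seq) % 3
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)]


# First row of SEQ ID NO: 171 (CDS LOCATION 1..579).
row = "ACA TGC GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTT GTG GGC GCC"
print(" ".join(translate(row)))
```

For listings whose coding region starts at position 2 or 3 (for example SEQ ID NO: 157, 159 and 161), the `start` argument would be set to match the LOCATION entry. A residue printed as `Ter` inside a coding region suggests a misread base in the nucleotide row.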
(2) INFORMATION FOR SEQ ID NO: 173:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 579 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..579

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..576

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

ACG TGC GGT TCC GCC GAC CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC 48
Thr Cys Gly Ser Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCT GTG GGT GGC GTC GCC AGG GCC TTG GCG CAT GGC GTC AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAG GAC GGG ATA AAC TAT GCA ACA GGG AAC CTT CCT GGT TGC TCT TTT 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTT CTG GCA CTT CTC TCG TGC CTG ACT GTC CCC GCC TCA 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCT GTG CAT TAT CAC AAC ACC TCG GGC ATC TAT CAC ATC ACT AAT GAC 240
Ala Val His Tyr His Asn Thr Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

TGC CCT AAC TCT AGC ATA GTC TTT GAG GCA GAG CAT CAC ATC TTG CAT 288
Cys Pro Asn Ser Ser Ile Val Phe Glu Ala Glu His His Ile Leu His
85 90 95

CTT CCA GGA TGC GTC CCC TGT GTG AGA ACT GGG AAC CAG TCA CGA TGC 336
Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys
100 105 110

TGG ATA GCC TTG ACC CCT ACG TTG GCC GCG CCA CAC ATT GGC GCT CCA 384
Trp Ile Ala Leu Thr Pro Thr Leu Ala Ala Pro His Ile Gly Ala Pro
115 120 125

CTT GAG TCC ATG CGA CGT CAT GTG GAT TTG ATG GTA GGC ACT GCC ACA 432
Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Thr Ala Thr
130 135 140

TTG TGC TCC GCA CTC TAC ATT GGA GAT CTG TGC GGA GGC ATA TTT CTA 480
Leu Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Ile Phe Leu
145 150 155 160

GTG GGC CAG ATG TTC AAC TTC AGG CCC CGC CTG CAC TGG ACC ACC CAG 528
Val Gly Gln Met Phe Asn Phe Arg Pro Arg Leu His Trp Thr Thr Gln
165 170 175

GAG TGC AAT TGT TCC ATC TAT CCA GGC CAC ATC ACG GGT CAC AGA ATG 576
Glu Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met
180 185 190

GCG 579
Ala



(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 193 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

Thr Cys Gly Ser Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Val His Tyr His Asn Thr Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Phe Glu Ala Glu His His Ile Leu His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys
100 105 110

Trp Ile Ala Leu Thr Pro Thr Leu Ala Ala Pro His Ile Gly Ala Pro
115 120 125

Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Thr Ala Thr
130 135 140

Leu Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Ile Phe Leu
145 150 155 160

Val Gly Gln Met Phe Asn Phe Arg Pro Arg Leu His Trp Thr Thr Gln
165 170 175

Glu Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met
180 185 190

Ala 



(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 579 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..579

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..576

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

ACG TGC GGC TTT GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCT GTG GGT GGC GTC GCC AGG GCC TTG GCA CAT GGT GTC AGG GCC GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAG GAC GGG ATT AAC TAT GCA ACA GGG AAT CTT CCC GGT TGC TCC TTT 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTT CTA GCA CTT CTC TCG TGC TTG ACT GTC CCG GCC TCG 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCG CAG CAC TAC CGG AAC ATC TCG GGC ATT TAT CAC GTC ACC AAT GAC 240
Ala Gln His Tyr Arg Asn Ile Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

TGC CCG AAC TCT AGT ATA GTG TAT GAA GCT GAC CAT CAT ATC ATG CAT 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His
85 90 95

CTA CCA GGG TGT GTG CCT TGC GTG AGA ACC GGG AAC ACC TCG CGC TGC 336
Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Thr Ser Arg Cys
100 105 110

TGG GTT CCT TTA ACA CCC ACT GTG GCT GCC CCC TAT GTT GGC GCG CCG 384
Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Val Gly Ala Pro
115 120 125

CTC GAA TCC ATG CGG CGG CAC GTG GAC TTA ATG GTG GGT GCC GCC ACC 432
Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

GTC TGC TCG GCC CTG TAC ATC GGA GAC CTT TGC GGA GGT GTC TTC CTG 480
Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

GTC GGG CAG ATG TTC ACC TTC CGG CCG CGC CGC CAT TGG ACT ACC CAG 528
Val Gly Gln Met Phe Thr Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAC TGC AAC TGC TCT ATC TAT GAT GGC CAC ATC ACC GGC CAT AGA ATG 576
Asp Cys Asn Cys Ser Ile Tyr Asp Gly His Ile Thr Gly His Arg Met
180 185 190

GCT 579
Ala



(2) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Gln His Tyr Arg Asn Ile Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Thr Ser Arg Cys
100 105 110

Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Val Gly Ala Pro
115 120 125

Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Asp Gly His Ile Thr Gly His Arg Met
180 185 190

Ala 



(2) INFORMATION FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..579 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..576 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177: 

ACG TGC GGG TTC GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCT 48 
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

CCA GTA GGA GGC GTC GCC AGA GCC TTG GCG CAT GGC GTC AGG GCT GTG 96 
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 

GAG GAC GGG ATC AAT TAC GCA ACA GGG AAC CTT CCC GGC TGC TCC TTT 144 
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

TCT ATC TTC CTC TTG GTA CTT CTC TCG CGC CTA ACT GTC CCA GCG TCT 192 
Ser Ile Phe Leu Leu Val Leu Leu Ser Arg Leu Thr Val Pro Ala Ser 
50 55 60 

GCT CAG CAC TAC CGG AAT GCA TCG GGC ATC TAC CAT GTC ACC AAC GAC 240 
Ala Gln His Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp 
65 70 75 80 

TGC CCG AAC TCC AGT ATT GTG TAT GAA GCC GAC CAT CAC ATC ATG CAC 288 
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His 
85 90 95 

CTA CCC GGG TGT GTG CCC TGT GTA AGA ACT GGG AAT GTC TCG CGT TGC 336 
Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Val Ser Arg Cys 
100 105 110 

TGG ATT CCT TTA ACX CCC ACT GTA GCC GTC CCC TAC CTC GGG GCT CCA 384 
Trp Ile Pro Leu Thr Pro Thr Val Ala Val Pro Tyr Leu Gly Ala Pro 
115 120 125 

CTT ACG TCT GTA CGG CAG CAT GTG GAC CTG ATG GTG GGG GCG GCC ACC 432 
Leu Thr Ser Val Arg Gln His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 

TTA TGC TCT GCC CTC TAC ATC GGA GAC CAT TGC GGA GGT GTC TTC TTG 480 
Leu Cys Ser Ala Leu Tyr Ile Gly Asp His Cys Gly Gly Val Phe Leu 
145 150 155 160 

GCA GGG CAG ATG GTC AGT TTC CAA CCC CGG CGT CAT TGG ACT ACC CAG 528 
Ala Gly Gln Met Val Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

GAT TGC AAC TGT TCC ATC TAT GTG GGC CAC ATC ACC GGC CAC AGG ATG 576 
Asp Cys Asn Cys Ser Ile Tyr Val Gly His Ile Thr Gly His Arg Met 
180 185 190 

GCC 579 
Ala 



(2) INFORMATION FOR SEQ ID NO: 178: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

Ser Ile Phe Leu Leu Val Leu Leu Ser Arg Leu Thr Val Pro Ala Ser 
50 55 60 

Ala Gln His Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp 
65 70 75 80 

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His 
85 90 95 

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Val Ser Arg Cys 
100 105 110 

Trp Ile Pro Leu Thr Pro Thr Val Ala Val Pro Tyr Leu Gly Ala Pro 
115 120 125 

Leu Thr Ser Val Arg Gln His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 

Leu Cys Ser Ala Leu Tyr Ile Gly Asp His Cys Gly Gly Val Phe Leu 
145 150 155 160 

Ala Gly Gln Met Val Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

Asp Cys Asn Cys Ser Ile Tyr Val Gly His Ile Thr Gly His Arg Met 
180 185 190 

Ala 



(2) INFORMATION FOR SEQ ID NO: 179: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..579 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179: 

ACCTGCGGCT TCGCCGACCT CATGGGATAC ATCCCGCTCG TAGGCGCCCC CGTGGGAGGC 60 

GTCGCCAGAR CTCTGGCGCA TGGCGTCAGG GCTCTGGAAG ACGGGATCAA TTATGCAAC^ 120 

GGGAATCTTC CTGGTTGCTC TTTCTCTATC TCCCTTCTTG AACTTCTCTC GTGCCTGACT 180 

GTTCCCGCCT CAGCCATCCA CTATCGCAAT GCTTCGGACG GTTATTATAT CACGAATGAT 240 

TGCCCGAACT CTAGCATAGT GTATGAAGCC GAGAACCACA TCTTGCACCT TCCGGGGTGT 300 

ATACCCTGTG TGAAGACCGG GAATCAGTCG CGGTGCTGGG TGGCTCTCAC CCCCACGCTG 360 

GCGGCCCCAC ACCTACGTGC TCCGCTTTCG TCCTTACGGG CGCATGTGGA CCTAATGGTG 420 

GGGGCCGCCA CGGCATGCTC CGCTTTTTAC ATTGGAGATC TGTGCGGGGG TGTGTTTTTG 480 

GCGGGCCAAC TGTTCACTAT CCGGCCACGC ATTCATGAAA CCACTCAGGA CTGCAATTGC 540 

TCCATCTACT CAGGGCACAT CACGGGTNNN NNNNNNNNN 579 

(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

Pro Val Gly Gly Val Ala Arg Xaa Leu Ala His Gly Val Arg Ala Leu 
20 25 30 

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

Ser Ile Ser Leu Leu Glu Leu Leu Ser Cys Leu Thr Val Pro Ala Ser 
50 55 60 

Ala Ile His Tyr Arg Asn Ala Ser Asp Gly Tyr Tyr Ile Thr Asn Asp 
65 70 75 80 

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Glu Asn His Ile Leu His 
85 90 95 

Leu Pro Gly Cys Ile Pro Cys Val Lys Thr Gly Asn Gln Ser Arg Cys 
100 105 110 

Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Pro His Leu Arg Ala Pro 
115 120 125 

Leu Ser Ser Leu Arg Ala His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 

Ala Cys Ser Ala Phe Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu 
145 150 155 160 

Ala Gly Gln Leu Phe Thr Ile Arg Pro Arg Ile His Glu Thr Thr Gln 
165 170 175 

Asp Cys Asn Cys Ser Ile Tyr Ser Gly His Ile Thr Gly Xaa Xaa Xaa 
180 185 190 

Xaa 



(2) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 1..578 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181: 

GCGTGCGGCT TCGCCGATCT CATGGGATAC ATCCCGCTCG TAGGCGCCCC CGTGGGTGGC 60 

GTCGCCAGAG CCCTGGCGCA CGGTGTTAGG GCTGTGGAGG ACGGGATTAA CTACGCAACA 120 

GGGAATCTTC CTGGTTGCTC TTTCTCTATC TNCCTTCTGG CACTTCTCTC GTGCCTGACT 180 

GTCCCGGCCT CGGCTCAGCA CTACCGGAAT GTCTCGGGCA TCTACCACGT CACCAATGAT 240 

TGCCCGAATT CCAGCATAGT GTATGAAGCC GATCACCACA TCATGCACTT ACCAGGGTGC 300 

ATACCCTGCG TGAGGACCGG GAACGTTTCC CGCTGCTGGG TATCTCTGAC ACCTACTGTG 360 

GCTGCTCCCT ACCTCGGGGC TCCGCTTACG TCGCTACGGC GGCATGTGGA TTTGATGGTG 420 

GGTGCAGCCA CCCTTTGCTC TGCCCTCTAC GTCGGAGACC TCTGTGGAGG TGTCTTCCTA 480 

GTGGGACAGA TGTTCACCTT CCAGCCGCGC CGCCACTGGA CCACTCAGGA CTGCAACTGC 540 

TCCATTTACG TCGGCCACAT CACAGGCCAC AGAATGGCT 579 

(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 

Ala Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

Ser Ile Xaa Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser 
50 55 60 

Ala Gln His Tyr Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp 
65 70 75 80 

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His 
85 90 95 

Leu Pro Gly Cys Ile Pro Cys Val Arg Thr Gly Asn Val Ser Arg Cys 
100 105 110 

Trp Val Ser Leu Thr Pro Thr Val Ala Ala Pro Tyr Leu Gly Ala Pro 
115 120 125 

Leu Thr Ser Leu Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 

Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Phe Leu 
145 150 155 160 

Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

Asp Cys Asn Cys Ser Ile Tyr Val Gly His Ile Thr Gly His Arg Met 
180 185 190 

Ala 



(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..579 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..579 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 

ACC TGC GGC TTT GCC GAC CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC 48 
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

CCT GTG GGT GGC GTC GCC AGG GCC CTA GAA CAC GGT GTT AGG GCT GTG 96 
Pro Val Gly Gly Val Ala Arg Ala Leu Glu His Gly Val Arg Ala Val 
20 25 30 

GAG GAC GGT ATT AAT TAT GCA ACA GGG AAT CTC CCC GGT TGC TCT TTT 144 
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

TCT ATC TCC CTC TTG GCA CTT CTT TCG TGC CTG ACT GTT CCC ACC TCA 192 
Ser Ile Ser Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser 
50 55 60 

GCC GTC AAC TAT CGC AAC GCC TCG GGC GTC TAT CAT ATC ACC AAT GAC 240 
Ala Val Asn Tyr Arg Asn Ala Ser Gly Val Tyr His Ile Thr Asn Asp 
65 70 75 80 

TGC CCG AAT TCG AGC ATA GTG TAC GAG GCT GAC TAC CAC ATC CTA CAC 288 
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Tyr His Ile Leu His 
85 90 95 

CTC CCT GGG TGC TTA CCC TGC GTG AGG GTT GGG AAT CAG TCA CGC TGC 336 
Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys 
100 105 110 

TGG GTG GCC CTT ACT CCC ACC GTG GCG GCG CCT TAC GTT GGT GCT CCG 384 
Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Val Gly Ala Pro 
115 120 125 







[...] [...] TCP CTC CGG AGT CAT GTG GAT CTG ATG GTA GGT GCT GCT ACT 432 
Leu Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 

[...] [...] [...] GCT CTT TAC ATC GGG GAC CTG TGC GGT GGC GTA TTT TTG 480 
Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu 
145 150 155 160 

GTT GGT CAG ATG TTT TCT TTC CAG CCG CGA CGC CAC TGG ACC ACG CAG 528 
Val Gly Gln Met Phe Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

GAC TGC AAT TGT TCT ATC TAC GCG GGG CAC GTT ACG GGC CAC AGG ATG 576 
Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Val Thr Gly His Arg Met 
180 185 190 

GCA 579 
Ala 



(2) INFORMATION FOR SEQ ID NO: 184: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser 
50 55 60 

Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp 
65 70 75 80 

Cys Pro Asn Ala Ser Ile Val Tyr Glu Thr Glu Asn His Ile Leu His 
85 90 95 

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys 
100 105 110 

Trp Val Ala Leu Thr Pro Thr Val Ala Ser Pro Tyr Ala Gly Ala Pro 
115 120 125 

Leu Glu Pro Leu Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 



Met Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Leu Phe Leu 
145 150 155 160 

Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg Met 
180 185 190 

Ala 



(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1S2 : 

Ala Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

Ser Ile Ser Phe Trp His Phe Ser Arg Ala * Leu Ser Arg Pro Arg 
50 55 60 

Leu Ser Thr Thr Gly Met Ser Arg Ala Ser Thr Thr Ser Pro Met Ile 
65 70 75 80 

Ala Arg Ile Pro Ala * Cys Met Lys Pro Ile Thr Thr Ser Cys Thr 
85 90 95 

Tyr Gln Gly Ala Tyr Pro Ala * Gly Pro Gly Thr Phe Arg Ala Ala 
100 105 110 

Gly Tyr Leu * His Leu Leu Trp Leu Leu Pro Thr Ser Gly Leu Arg 
115 120 125 

Leu Arg Arg Tyr Gly Gly Met Trp Ile * Trp Trp Val Gln Pro Pro 
130 135 140 

Phe Ala Leu Pro Ser Thr Ser Glu Thr Ser Val Glu Val Ser Ser * 
145 150 155 160 

Trp Asp Arg Cys Ser Pro Ser Ser Arg Ala Ala Thr Gly Pro Leu Arg 
165 170 175 

Thr Ala Thr Ala Pro Phe Thr Ser Ala Thr Ser Gln Ala Thr Glu Trp 
180 185 190 



(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..579 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..576 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 

ACT TGC GGC TTT GCC GAC CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC 48 
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

CCC GTG GGT GGC GTC GCC AGA GCC CTG GAA CAT GGT GTT AGG GCT GTG 96 
Pro Val Gly Gly Val Ala Arg Ala Leu Glu His Gly Val Arg Ala Val 
20 25 30 

GAG GAC GGC ATC AAT TAT GCA ACA GGG AAT CTC CCC GGT TGC TCT TTC 144 
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

TCT ATC TAC CTC TTG GCA CTT CTC TCG TGC CTG ACT GTT CCC ACC TCG 192 
Ser Ile Tyr Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser 
50 55 60 

GCC ATC CAC TAT CGC AAT GCC TCG GGC GTC TAC CAC GTC ACC AAT GAC 240 
Ala Ile His Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp 
65 70 75 80 

TGC CCG AAC TCG AGC ATA GTG TAC GAG GCC GAC CAC CAC ATC CTA CAC 288 
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu His 
85 90 95 

CTT CCA GGG TGC TTA CCC TGT GTG AGG GTT GGG AAT CAG TCA CGT TGT 336 
Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys 
100 105 110 

TGG GTG GCC CTC TCT CCC ACC GTG GCG GCG CCT TAC ATC GGT GCT CCA 384 
Trp Val Ala Leu Ser Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro 
115 120 125 

GTT GAA TCC TTC CGG AGA CAC GTG GAC ATG ATG GTG GGC GCT GCT ACT 432 
Val Glu Ser Phe Arg Arg His Val Asp Met Met Val Gly Ala Ala Thr 
130 135 140 

GTG TGC TCC GCT CTC TAT ATT GGG GAC TTG TGT GGT GGC GTA TTC TTG 480 
Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu 
145 150 155 160 

GTT GGT CAG ATG TTT TCT TTC CGG CCA CGA CGC CAC TGG ACT ACG CAG 528 
Val Gly Gln Met Phe Ser Phe Arg Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

GAC TGC AAT TGT TCC ATC TAC GCG GGG CAC ATC ACT GGC CAC GGA ATG 576 
Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Gly Met 
180 185 190 

GCA 579 
Ala 



(2) INFORMATION FOR SEQ ID NO: 186: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

Pro Val Gly Gly Val Ala Arg Ala Leu Glu His Gly Val Arg Ala Val 
20 25 30 

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

Ser Ile Tyr Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser 
50 55 60 

Ala Ile His Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp 
65 70 75 80 

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu His 
85 90 95 

Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys 
100 105 110 

Trp Val Ala Leu Ser Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro 
115 120 125 

Val Glu Ser Phe Arg Arg His Val Asp Met Met Val Gly Ala Ala Thr 
130 135 140 

Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu 
145 150 155 160 

Val Gly Gln Met Phe Ser Phe Arg Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Gly Met 
180 185 190 

Ala 



(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..579 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..576 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 

ACT TGC GGC TTT GCC GAC CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC 48 
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

CCT GTG GGT GGC GTC GCC AGG GCC CTG GCA CAC GGT GTT AGG GCT GTG 96 
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 

GAG GAC GGG ATC AAT TAT GCG ACA GGG AAT CTT CCC GGT TGC TCT TTC 144 
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

TCT ATC TTC CTC TTG GCA CTT CTT TCG TGC CTG ACT GTT CCC ACC TCG 192 
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser 
50 55 60 

GCC GTC AAC TAT CGC AAT GCC TCG GGC ATC TAT CAC ATC ACC AAT GAC 240 
Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp 
65 70 75 80 

TGC CCG AAC TCG AGC ATA GTG TAC GAG ACC GAG CAC CAC ATC CTA CAC 288 
Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Glu His His Ile Leu His 
85 90 95 

CTC CCA GGG TGT TTA CCC TGC GTG AGG GTT GGG AAT CAG TCA CGC TGC 336 
Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys 
100 105 110 

TGG GTG GCC CTC ACT CCC ACC GTG GCG GCG CCT TAC ATC GGC GCT CCG 384 
Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro 
115 120 125 

CTT GAA TCC CTC CGG AGT CAT GTG GAT CTG ATG GTA GGT GCC GCT ACT 432 
Leu Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 

GCG TGC TCC GCT CTT TAC ATC GGA GAC CTG TGC GGT GGC GTA TTT TTG 480 
Ala Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu 
145 150 155 160 

GTT GGT CAG ATG TTC TCT TTC CAG CCG CGG CGC CAC TGG ACT ACG CAG 528 
Val Gly Gln Met Phe Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

GAC TGC AAT TGT TCC ATC TAC GCG GGG CAC GTT ACG GGC CAC AGG ATG 576 
Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Val Thr Gly His Arg Met 
180 185 190 

GCA 579 
Ala 



(2) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser 
50 55 60 

Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp 
65 70 75 80 

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Glu His His Ile Leu His 
85 90 95 

Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys 
100 105 110 

Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro 
115 120 125 

Leu Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 

Ala Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu 
145 150 155 160 

Val Gly Gln Met Phe Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Val Thr Gly His Arg Met 
180 185 190 

Ala 



(2) INFORMATION FOR SEQ ID NO: 189: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..579 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..576 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 

ACQ TGC GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCC 48 
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

CCC GTT GGG GGC GTC GCC AGG GCC CTG GCG CAT GGC GTC AGG GCT GTG 96 
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 

GAG GAC GGG ATT AAC TAT GCG ACA GGG AAT CTT CCC GGT TGC TCT TTC 144 
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

TCT ATC TTC CTC CTG GCA CTT CTT TCG TGC CTC ACT GTC CCA GCG TCA 192 
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser 
50 55 60 

GCT GAG CAC TAC CGG AAT GCT TCG GGC ATC TAT CAC ATC ACC AAT GAC 240 
Ala Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp 
65 70 75 80 

TGT CCG AAT TCC AGC GTA GTC TAT GAA ACT GAC CAC CAT ATA TTG CAC 288 
Cys Pro Asn Ser Ser Val Val Tyr Glu Thr Asp His His Ile Leu His 
85 90 95 

TTG CCG GGG TGC GTA CCC TGC GTG AGG GCC GGG AAC GTG TCT CGT TGC 336 
Leu Pro Gly Cys Val Pro Cys Val Arg Ala Gly Asn Val Ser Arg Cys 
100 105 110 

TGG ACG CCG GTA ACA CCT ACG GTG GCT GCC GTA TCC ATG GAC GCT CCG 384 
Trp Thr Pro Val Thr Pro Thr Val Ala Ala Val Ser Met Asp Ala Pro 
115 120 125 

CTC GAG TCC TTC CGG CGG CAT GTG GAC CTA ATG GTA GGT GCG GCC ACC 432 
Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 

GTG TGT TCT GTC CTC TAT GTT GGA GAC CTC TGT GGA GGT GCT TTC CTA 480 
Val Cys Ser Val Leu Tyr Val Gly Asp Leu Cys Gly Gly Ala Phe Leu 
145 150 155 160 

GTG GGG CAG ATG TTC ACC TTC CAG CCG CGT CGC CAC TGG ACC ACG CAG 528 
Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

GAT TGT AAT TGC TCC ATC TAT ACT GGC CAT ATC ACC GGC CAC AGG ATG 576 
Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg Met 
180 185 190 

GCG 579 
Ala 



(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser 
50 55 60 

Ala Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp 
65 70 75 80 

Cys Pro Asn Ser Ser Val Val Tyr Glu Thr Asp His His Ile Leu His 
85 90 95 

Leu Pro Gly Cys Val Pro Cys Val Arg Ala Gly Asn Val Ser Arg Cys 
100 105 110 

Trp Thr Pro Val Thr Pro Thr Val Ala Ala Val Ser Met Asp Ala Pro 
115 120 125 

Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 

Val Cys Ser Val Leu Tyr Val Gly Asp Leu Cys Gly Gly Ala Phe Leu 
145 150 155 160 

Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg Met 
180 185 190 

Ala 



(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 289 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 1..289 



(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..286 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT AAC ACC AAC 48 
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn 
1 5 10 15 

CGC CGC CCC ATG GAC GTT AAG TTC CCG GGC GGT GGC CAG ATC GTT GGT 96 
Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly 
20 25 30 

GGA GTT TAC TTG TTG CCG CGC AGG GGC CCC AGG TTG GGT GTG CGC GCG 144 
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 

ACT AGG AAG ACT TCG GAG CGG TCG CAA CCT CGT GGG AGA CGT CAG CCT 192 
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro 
50 55 60 

ATC CCC AAG GCA CGT CGA TCT GAG GGA AGG TCC TGG GCT CAG CCC GGG 240 
Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly 
65 70 75 80 

TAC CCA TGG CCT CTT TAC GGT AAT GAG GGT TGT GGG TGG GCA GGA TGG G 289 
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95 



(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn 
1 5 10 15 

Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly 
20 25 30 

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro 
50 55 60 

Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly 
65 70 75 80 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95 



(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 498 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..498 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..495 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT AAC ACC AAC 48 
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn 
1 5 10 15 

CGC CGC CCT ATG GAC GTA AAG TTC CCG GGC GGT GGA CAG ATC GTT GGC 96 
Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly 
20 25 30 

GGA GTT TAC TTG TTG CCG CGC AGG GGC CCC CGG TTG GGT GTG CGC GCG 144 
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 

ACT CGG AAG ACT TCG GAG CGG TCG CAA CCT CGT GGC AGG CGT CAA CCT 192 
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro 
50 55 60 

ATC CCC AAG GCG CGC CGG TCC GAG GGC AGG TCC TGG GCG CAA GCC GGG 240 
Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Ala Gly 
65 70 75 80 

TAC CCC TGG CCC CTC TAT GGC AAT GAG GGC TGT GGG TGG GCA GGG TGG 288 
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95 

CTC CTG TCT CCT CGC GGC TCT CGG CCA TCT TGG GGC CCA AAT GAT CCC 336 
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro 
100 105 110 

CGG CGG AGA TCG CGC AAT CTG GGT AAG GTC ATC GAT ACC CTG ACG TGC 384 
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys 
115 120 125 

GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCC CCC GTC 432 
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Val 
130 135 140 

GGG GGC GTC GCC AGG GCC CTG GCG CAT GGC GTC AGG GCT GTG GAG GAC 480 
Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val Glu Asp 
145 150 155 160 

GGG ATT AAC TAT CGA CAG 498 
Gly Ile Asn Tyr Arg Gln 
165 



(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 166 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn 
1 5 10 15 

Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly 
20 25 30 

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro 
50 55 60 

Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Ala Gly 
65 70 75 80 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95 

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro 
100 105 110 

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys 
115 120 125 

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Val 
130 135 140 

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val Glu Asp 
145 150 155 160 

Gly Ile Asn Tyr Arg Gln 
165 



(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..579 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..576 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195: 

ACG TGC GGA TTC GCC GAC CTC GTG GGG TAC ATC CCG CTC GTA GGC GGC 48 
Thr Cys Gly Phe Ala Asp Leu Val Gly Tyr Ile Pro Leu Val Gly Gly 
1 5 10 15 

CCC GTT GGG GGC GTC GCA AGG GCT CTC GCA CAT GGT GTG AGG GTT CTT 96 
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu 
20 25 30 

GAG GAC GGG GTG AAT TAT GCA ACA GGG AAT CTG CCT GGT TGC TCT TTC 144 
Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

TCT ATC TTC ATT CTT GCA CTT CTC TCG TGC CTC ACT GTC CCG GCC TCT 192 
Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser 
50 55 60 

GCA GTT CCC TAC CGA AAT GCC TCT GGG ATC TAT CAT GTC ACC AAT GAT 240 
Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp 
65 70 75 80 

TGC CCA AAC TCT TCC ATA GTC TAT GAG GCA GAT GAT CTG ATC CTA CAC 288 
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu His 
85 90 95 

GCA CCT GGC TGC GTG CCT TGT GTC AGG AAA GAT AAT GTG AGT AGG TGC 336 
Ala Pro Gly Cys Val Pro Cys Val Arg Lys Asp Asn Val Ser Arg Cys 
100 105 110 

[...] [...] [...] ATT ACC CCC ACG CTG TCA GCC CCG AGC TTC GGA GCA GTC 384 
Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Phe Gly Ala Val 
115 120 125 

ACC [...] [...] CTT [...] [...] GCw GTT GAT TAC TTG GTG GGA GGG GCT GCC 432 
Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Val Gly Gly Ala Ala 
130 135 140 

CTC TGC TCC GCG TTA TAC GTT GGA GAC GCG TGT GGG GCA CTA TTT TTG 480 
Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu 
145 150 155 160 

GTA GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAT GCT ACG GTG CAG 528 
Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln 
165 170 175 

GAC TGC AAC TGT TCC ATC TAC AGT GGC CAC GTC ACC GGC CAT CAG ATG 576 
Asp Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Gln Met 
180 185 190 

Ala 



(2) INFORMATION FOR SEQ ID NO: 196:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 193 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:

Thr Cys Gly Phe Ala Asp Leu Val Gly Tyr Ile Pro Leu Val Gly Gly
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu
20 25 30

Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu His
85 90 95

Ala Pro Gly Cys Val Pro Cys Val Arg Lys Asp Asn Val Ser Arg Cys
100 105 110

Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Phe Gly Ala Val
115 120 125

Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Val Gly Gly Ala Ala
130 135 140

Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Gln Met
180 185 190

Ala



(2) INFORMATION FOR SEQ ID NO: 197:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 579 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..579

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..576



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

ACT TGC GGC TTT GCC GAC CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCC GTG GGT GGC GTC GCC AGA GCC CTG GAA CAT GGT GTT AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Glu His Gly Val Arg Ala Val
20 25 30

GAG GAC GGC ATC AAT TAT GCA ACA GGG AAT CTC CCC GGT TGC TCT TTC 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TAC CTC TTG GCA CTT CTC TCG TGC CTG ACT GTT CCC ACC TCG 192
Ser Ile Tyr Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

GCC ATC CAC TAT CGC AAT GCC TCG GGC GTC TAC CAC GTC ACC AAT GAC 240
Ala Ile His Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
65 70 75 80

TGC CCG AAC TCG AGC ATA GTG TAC GAG GCC GAC CAC CAC ATC CTA CAC 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu His
85 90 95

CTT CCA GGG TGC TTA CCC TGT GTG AGG GTT GGG AAT CAG TCA CGT TGT 336
Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys
100 105 110

TGG GTG GCC CTC TCT CCC ACC GTG GCG GCG CCT TAC ATC GGT GCT CCA 384
Trp Val Ala Leu Ser Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro
115 120 125

GTT GAA TCC TTC CGG AGA CAC GTG GAC ATG ATG GTG GGC GCT GCT ACT 432
Val Glu Ser Phe Arg Arg His Val Asp Met Met Val Gly Ala Ala Thr
130 135 140

GTG TGC TCC GCT CTC TAT ATT GGG GAC TTG TGT GGT GGC GTA TTC TTG 480
Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

GTT GGT CAG ATG TTT TCT TTC CGG CCA CGA CGC CAC TGG ACT ACG CAG 528
Val Gly Gln Met Phe Ser Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAC TGC AAT TGT TCC ATC TAC GCG GGG CAC ATC ACT GGC CAC GGA ATG 576
Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Gly Met
180 185 190

GCA 579
Ala



(2) INFORMATION FOR SEQ ID NO: 198: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Glu His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Tyr Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

Ala Ile His Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu His
85 90 95

Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys
100 105 110

Trp Val Ala Leu Ser Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro
115 120 125

Val Glu Ser Phe Arg Arg His Val Asp Met Met Val Gly Ala Ala Thr
130 135 140

Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

Val Gly Gln Met Phe Ser Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Gly Met
180 185 190

Ala



(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1470 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..1470

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 2..1467



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

A TCA CCA CCG GAG CTT CTA TCA CAT ACT CCA CTT ACG GCA AGT TCC 46
Ser Pro Pro Glu Leu Leu Ser His Thr Pro Leu Thr Ala Ser Ser
1 5 10 15

TTG CTG ATG GAG GGT GTT CAG GCG GCG CGC ATG ACG TGA TCA TAT GCG 94
Leu Leu Met Glu Gly Val Gln Ala Ala Arg Met Thr * Ser Tyr Ala
20 25 30

ACG AGT GCC ATT CCC AGG ACG CCA CCA CCA TTC TTG GGA TAG GCA CTG 142
Thr Ser Ala Ile Pro Arg Thr Pro Pro Pro Phe Leu Gly * Ala Leu
35 40 45

TCC TTG ACC AGG CAG AGA CGG CTG GAG CTA GGC TCG TCG TCT TGG CCA 190
Ser Leu Thr Arg Gln Arg Arg Leu Glu Leu Gly Ser Ser Ser Trp Pro
50 55 60

CGG CCA CCC CTC CCG GCA GTG TGA CAA CGC CCC ACC CCA ACA TCG AGG 238
Arg Pro Pro Leu Pro Ala Val * Gln Arg Pro Thr Pro Thr Ser Arg
65 70 75

AAG TGG CCC TGC CTC AGG AGG GGG AGG TTC CCT TCT ACG GCA GAG CCA 286
Lys Trp Pro Cys Leu Arg Arg Gly Arg Phe Pro Ser Thr Ala Glu Pro
80 85 90 95

TTC CCC TTG CTT TTA TAA AGG GTG GTA GGC ATC TCA TCT TCT GCC ATT 334
Phe Pro Leu Leu Leu * Arg Val Val Gly Ile Ser Ser Ser Ala Ile
100 105 110

CCA AGA AAA AAT GTG ATG AAC TCG CCA AGC AAC TGA CCA GCC TGG GCG 382
Pro Arg Lys Asn Val Met Asn Ser Pro Ser Asn * Pro Ala Trp Ala
115 120 125

TGA ACG CCG TGG CAT ATT ATA GAG GTC TAG ACG TCG CCG TCA TAC CCA 430
* Thr Pro Trp His Ile Ile Glu Val * Thr Ser Pro Ser Tyr Pro
130 135 140

CAA CAG GAG ACG TGG TCG TGT GCA GCA CCG ACG CGC TCA TGA CGG GAT 478
Gln Gln Glu Thr Trp Ser Cys Ala Ala Pro Thr Arg Ser * Arg Asp
145 150 155

TCA CCG GCG ACT TTG ATT CTG TCA TAG ACT GCA ACT CCG CCG TCA CTC 526
Ser Pro Ala Thr Leu Ile Leu Ser * Thr Ala Thr Pro Pro Ser Leu
160 165 170 175

AGA CGG TGG ACT TCA GTC TGG ATC CCA CTT TTA CCA TTG AGA CTA CCA 574
Arg Arg Trp Thr Ser Val Trp Ile Pro Leu Leu Pro Leu Arg Leu Pro
180 185 190

CAG TGC CCC AGG ACG CAG TGT CCA GAA GCC AGC GTT GGG GCC GCA CGG 622
Gln Cys Pro Arg Thr Gln Cys Pro Glu Ala Ser Val Gly Ala Ala Arg
195 200 205

GGA GAG GTA GGC ACG GCA TAT ACC GGT ATG TCT CGG CTG GAG AGA GAC 670
Gly Glu Val Gly Thr Ala Tyr Thr Gly Met Ser Arg Leu Glu Arg Asp
210 215 220

CGT CTG GCA TGT TCG ACT CCG TGG TGC TCT GTG AGT GCT ACG ATG CCG 718
Arg Leu Ala Cys Ser Thr Pro Trp Cys Ser Val Ser Ala Thr Met Pro
225 230 235

GAT GTG CAT GGT ACG ATC TGA CTC CTG CCG AGA CTA CCG TGA GGT TGC 766
Asp Val His Gly Thr Ile * Leu Leu Pro Arg Leu Pro * Gly Cys
240 245 250 255

GCG CTT ACT AAA CAC CCC CGG GCT CCC TGT CTG TCA GGA CCA TTT GGA 814
Ala Leu Thr Lys His Pro Arg Ala Pro Cys Leu Ser Gly Pro Phe Gly
260 265 270

ATT CTG GGA GGG GGT GTT CAC GGG GCT CAC TAA CAT CGA CGC TCA CAT 862
Ile Leu Gly Gly Gly Val His Gly Ala His * His Arg Arg Ser His
275 280 285

GCT GTC ACA GAC CAA ACA GGG TGG GGA GAA TTT CCC ATA CCT TGT AGC 910
Ala Val Thr Asp Gln Thr Gly Trp Gly Glu Phe Pro Ile Pro Cys Ser
290 295 300

GTA CCA AGC AAC AGT CTG TGT TCG CGC GAA AGC GCC CCC CCC CAG CTG 958
Val Pro Ser Asn Ser Leu Cys Ser Arg Glu Ser Ala Pro Pro Gln Leu
305 310 315

GGA CAC AAT GTG GAA ATG CAT GCT CCG TCT CAA ACC GAC TTA ACT GGC 1006
Gly His Asn Val Glu Met His Ala Pro Ser Gln Thr Asp Leu Thr Gly
320 325 330 335

[...] 1054
Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln Asn Glu Ile Thr
340 345 350

CTG ACG CAC CCC ATC ACC AAG TAC ATT ATG GCT TGC ATG TCT GCG GAC 1102
Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala Asp
355 360 365

TTG GAG GTC ATT ACC AGC ACT TGG GTT CTG GTG GGG GGC GTT GTG GCG 1150
Leu Glu Val Ile Thr Ser Thr Trp Val Leu Val Gly Gly Val Val Ala
370 375 380

GCC CTG GCG GCC TAC TGC TTG ACG GTG GGT TCG GTA GCC ATA GTC GGT 1198
Ala Leu Ala Ala Tyr Cys Leu Thr Val Gly Ser Val Ala Ile Val Gly
385 390 395

AGG ATC ATC CTC TCT GGG AAA CCT GCC ATC ATT CCC GAT AGG GAG GTA 1246
Arg Ile Ile Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val
400 405 410 415

TTA TAC CAG CAA TTT GAT GAG ATG GAG GAG TGC TCG GCC TCG TTG CCC 1294
Leu Tyr Gln Gln Phe Asp Glu Met Glu Glu Cys Ser Ala Ser Leu Pro
420 425 430

TAT ATG GAC GAA ACA CGT GCC ATT GCC GGA CAA TTC AAA GAG AAA GTG 1342
Tyr Met Asp Glu Thr Arg Ala Ile Ala Gly Gln Phe Lys Glu Lys Val
435 440 445

CTC GGC TTC ATC AGC ACG ACC GGC CAG AAG GCT GAA ACT CTG AAG CCG 1390
Leu Gly Phe Ile Ser Thr Thr Gly Gln Lys Ala Glu Thr Leu Lys Pro
450 455 460

GCA GCC ACG TCT GTG TGG AAC AAG GCT GAG CAG TTC TGG CCA CAT ACA 1438
Ala Ala Thr Ser Val Trp Asn Lys Ala Glu Gln Phe Trp Pro His Thr
465 470 475

TGT GGA ACT TCA TCA GTG GGA TAC AAT AAT AG 1470
Cys Gly Thr Ser Ser Val Gly Tyr Asn Asn
480 485



(2) INFORMATION FOR SEQ ID NO: 197:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1485 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1485

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:

TGTGCCAGGA CCATCACCAC CGGAGCTTCT ATCACATACT CCACTTACGG CAAGTTCCTT 60

GCTGATGGAG GGTGTTCAGG CGGCGCGCAT GACGTGATCA TATGCGACGA GTGCCATTCC 120

CAGGACGCCA CCACCATTCT TGGGATAGGC ACTGTCCTTG ACCAGGCAGA GACGGCTGGA 180

GCTAGGCTCG TCGTCTTGGC CACGGCCACC CCTCCCGGCA GTGTGACAAC GCCCCACCCC 240

AACATCGAGG AAGTGGCCCT GCCTCAGGAG GGGGAGGTTC CCTTCTACGG CAGAGCCATT 300

CCCCTTGCTT TTATAAAGGG TGGTAGGCAT CTCATCTTCT GCCATTCCAA GAAAAAATGT 360

GATGAACTCG CCAAGCAACT GACCAGCCTG GGCGTGAACG CCGTGGCATA TTATAGAGGT 420

CTAGACGTCG CCGTCATACC CACAACAGGA GACGTGGTCG TGTGCAGCAC CGACGCGCTC 480

ATGACGGGAT TCACCGGCGA CTTTGATTCT GTCATAGACT GCAACTCCGC CGTCACTCAG 540

ACGGTGGACT TCAGTCTGGA TCCCACTTTT ACCATTGAGA CTACCACAGT GCCCCAGGAC 600

GCAGTGTCCA GAAGCCAGCG TTGGGGCCGC ACGGGGAGAG GTAGGCACGG CATATACCGG 660

TATGTCTCGG CTGGAGAGAG ACCGTCTGGC ATGTTCGACT CCGTGGTGCT CTGTGAGTGC 720

TACGATGCCG GATGTGCATG GTACGATCTG ACTCCTGCCG AGACTACCGT GAGGTTGCGC 780

GCTTACNTAA ACACCCCCGG GCTCCCTGTC TGTCAGGACC ATTTGGAATT CTGGGAGGGG 840

GTGTTCACGG GGCTCACTAA CATCGACGCT CACATGCTGT CACAGACCAA ACAGGGTGGG 900

GAGAATTTCC CATACCTTGT AGCGTACCAA GCAACAGTCT GTGTTCGCGC GAAAGCGCCC 960

CCCCCCAGCT GGGACACAAT GTGGAAATGC ATGCTCCGTC TCAAACCGAC NTTAACTGGC 1020

CCTACTCCCC TCTTGTACAG GCTGGGGCCC GTCCAGAATG AGATCACACT GACGCACCCC 1080

ATCACCAAGT ACATTATGGC TTGCATGTCT GCGGACTTGG AGGTCATTAC CAGCACTTGG 1140

GTTCTGGTGG GGGGCGTTGT GGCGGCCCTG GCGGCCTACT GCTTGACGGT GGGTTCGGTA 1200

GCCATAGTCG GTAGGATCAT CCTCTCTGGG AAACCTGCCA TCATTCCCGA TAGGGAGGTA 1260

TTATACCAGC AATTTGATGA GATGGAGGAG TGCTCGGCCT CGTTGCCCTA TATGGACGAA 1320

ACACGTGCCA TTGCCGGACA ATTCAAAGAG AAAGTGCTCG GCTTCATCAG CACGACCGGC 1380

CAGAAGGCTG AAACTCTGAA GCCGGCAGCC ACGTCTGTGT GGAACAAGGC TGAGCAGTTC 1440

TGGNCCACAT ACATGTGGAA CTTCATCAGT GGGATACAAT AATAG 1485
(2) INFORMATION FOR SEQ ID NO: 198:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 484 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:

Cys Ala Arg Thr Ile Thr Thr Gly Ala Ser Ile Thr Tyr Ser Thr Tyr
1 5 10 15

Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala His Asp Val
20 25 30

Ile Ile Cys Asp Glu Cys His Ser Gln Asp Ala Thr Thr Ile Leu Gly
35 40 45

Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val
50 55 60

Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro
65 70 75 80

Asn Ile Glu Glu Val Ala Leu Pro Gln Glu Gly Glu Val Pro Phe Tyr
85 90 95

Gly Arg Ala Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu Ile
100 105 110

Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Lys Gln Leu Thr
115 120 125

Ser Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ala
130 135 140

Val Ile Pro Thr Thr Gly Asp Val Val Val Cys Ser Thr Asp Ala Leu
145 150 155 160

Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Ser
165 170 175

Ala Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile
180 185 190

[...]
195 200 205

Gly Arg Thr Gly Arg Gly Arg His Gly Ile Tyr Arg Tyr Val Ser Ala
210 215 220

Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Val Val Leu Cys Glu Cys
225 230 235 240

Tyr Asp Ala Gly Cys Ala Trp Tyr Asp Leu Thr Pro Ala Glu Thr Thr
245 250 255

Val Arg Leu Arg Ala Tyr Xaa Asn Thr Pro Gly Leu Pro Val Cys Gln
260 265 270

Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr Asn Ile
275 280 285

Asp Ala His Met Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Pro
290 295 300

Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Val Arg Ala Lys Ala Pro
305 310 315 320

Pro Pro Ser Trp Asp Thr Met Trp Lys Cys Met Leu Arg Leu Lys Pro
325 330 335

Xaa Leu Thr Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln
340 345 350

Asn Glu Ile Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys
355 360 365

Met Ser Ala Asp Leu Glu Val Ile Thr Ser Thr Trp Val Leu Val Gly
370 375 380

Gly Val Val Ala Ala Leu Ala Ala Tyr Cys Leu Thr Val Gly Ser Val
385 390 395 400

Ala Ile Val Gly Arg Ile Ile Leu Ser Gly Lys Pro Ala Ile Ile Pro
405 410 415

Asp Arg Glu Val Leu Tyr Gln Gln Phe Asp Glu Met Glu Glu Cys Ser
420 425 430

Ala Ser Leu Pro Tyr Met Asp Glu Thr Arg Ala Ile Ala Gly Gln Phe
435 440 445

Lys Glu Lys Val Leu Gly Phe Ile Ser Thr Thr Gly Gln Lys Ala Glu
450 455 460

Thr Leu Lys Pro Ala Ala Thr Ser Val Trp Asn Lys Ala Glu Gln Phe
465 470 475 480

Trp Xaa Thr Tyr



(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1485 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1485

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

TGTGCCAGGA CCATCACCAC CGGAGCTTCT ATCACATACT CCACTTACGG CAAGTTCCTT 60

GCTGATGGAG GGTGTTCAGG CGGCGCGTAT GACGTGATCA TATGCGACGA GTGCCATTCC 120

CAGGACGCCA CCACCATTCT TGGGATAGGC ACTGTCCTTG ACCAGGCAGA GACGGCTGGA 180

GCTAGGCTCG TCGTCTTGGC CACGGCCACC CCTCCCGGCA GTGTGACAAC GCCCCACCCC 240

AACATCGAGG AAGTGGCCCT GCCTCAGGAG GGGGAGGTTC CCTTCTACGG CAGAGCCATT 300

CCCCTTGCTT TTATAAAGGG TGGTAGGCAT CTCATCTTCT GCCATTCCAA GAAAAAATGT 360

GATGAACTCG CCAAGCAACT GACCAGCCTG GGCGTGAACG CCGTGGCATA TTATAGAGGT 420

CTAGACGTCG CCGTCATCCC CACAGCAGGA GACGTGGTCG TGTGCAGCAC CGACGCGCTC 480

ATGACGGGAT TCACCGGCGA CTTTGATTCT GTCATAGACT GCAACTCCGC CGTCACTCAG 540

ACGGTGGACT TCAGTCTGGA TCCCACTTTT ACCATTGAGA CTACCACAGT GCCCCAGGAC 600

GCAGTGTCCA GAAGCCAGCG TAGGGGCCGC ACGGGGAGAG GTAGGCACGG CATATACCGG 660

TATGTCTCGG CTGGAGAGAG ACCNTCTGAC ATGTTCGACT CCGTGGTGCT CTGTGAGTGC 720

TACGATGCCG GATGTGCGTG GTATGATCTG ACTCCTGCCG AGACTACCGT GAGGTTGCGC 780

GCTTACATAA ACACCCCCGG GCTCCCTGTC TGTCAGGACC ATTTGGAATT CTGGGAGGGG 840

GTGTTCACGG GGCTCACTAA CATCGACGCT CACATGCTGT CACAGACCAA ACAGGGTGGG 900

GAGAATTTNC CATACCTTGT AGCGTACCAA GCAACAGTCT GTGTTCGCGC GAAAGCGCCC 960

CCCCCCAGCT GGGACACAAT GTGGAAATGC ATGCTCCGTC TCAAACCGAC TTTAACTGGC 1020

CCTACTCCCC TCTTGTACAG GCTGGGGCCC GTCCAGANTG AGATCACACT GACGCACCCC 1080

ATCACCAAGT ACATTATGGC TTGCATGTCT GCGGACTTGG AGGTCATTAC CANCACTTGG 1140

GTTCTGGTGG GGGGCGTTGT GGCGGCCCTG GCGGCCTACT GCTTGACGGT GGGTTCGGTA 1200

GCCATAGTCG GTAGGATCAT CCTCTCTGGG AAACCTGCCA TCATTCCCGA TAGGGAGGCA 1260

TTATACCAGC AATTTGATGA GATGGAGGAG TGCTCGGCCT CGTTGCCCTA TATGGACGAG 1320

ACACGTGCCA TTGCCGGACA ATTCAAAGAG AAAGTGCTCG GCTTCATCAG CACGACCGGC 1380

CAGAAGGCTG AAACTCTGAA GCCGGCAGCC ACGTCTGTGT GGAACAAGGC TGAGCAGTTC 1440

TGGGCCACAT ACATGTGGAA CTTCATCAGC GCGATACAAT AATAG 1485

(2) INFORMATION FOR SEQ ID NO: 200:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 484 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200:

Cys Ala Arg Thr Ile Thr Thr Gly Ala Ser Ile Thr Tyr Ser Thr Tyr
1 5 10 15

Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Val
20 25 30

Ile Ile Cys Asp Glu Cys His Ser Gln Asp Ala Thr Thr Ile Leu Gly
35 40 45

Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val
50 55 60

Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro
65 70 75 80

Asn Ile Glu Glu Val Ala Leu Pro Gln Glu Gly Glu Val Pro Phe Tyr
85 90 95

Gly Arg Ala Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu Ile
100 105 110

Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Lys Gln Leu Thr
115 120 125

Ser Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ala
130 135 140

Val Ile Pro Thr Ala Gly Asp Val Val Val Cys Ser Thr Asp Ala Leu
145 150 155 160

Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Ser
165 170 175

Ala Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile
180 185 190

Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg
195 200 205

Gly Arg Thr Gly Arg Gly Arg His Gly Ile Tyr Arg Tyr Val Ser Ala
210 215 220

Gly Glu Arg Xaa Ser Asp Met Phe Asp Ser Val Val Leu Cys Glu Cys
225 230 235 240

Tyr Asp Ala Gly Cys Ala Trp Tyr Asp Leu Thr Pro Ala Glu Thr Thr
245 250 255

Val Arg Leu Arg Ala Tyr Ile Asn Thr Pro Gly Leu Pro Val Cys Gln
260 265 270

Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr Asn Ile
275 280 285

Asp Ala His Met Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Xaa Pro
290 295 300

Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Val Arg Ala Lys Ala Pro
305 310 315 320

Pro Pro Ser Trp Asp Thr Met Trp Lys Cys Met Leu Arg Leu Lys Pro
325 330 335

Thr Leu Thr Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln
340 345 350

Xaa Glu Ile Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys
355 360 365

Met Ser Ala Asp Leu Glu Val Ile Thr Xaa Thr Trp Val Leu Val Gly
370 375 380

Gly Val Val Ala Ala Leu Ala Ala Tyr Cys Leu Thr Val Gly Ser Val
385 390 395 400

Ala Ile Val Gly Arg Ile Ile Leu Ser Gly Lys Pro Ala Ile Ile Pro
405 410 415

Asp Arg Glu Ala Leu Tyr Gln Gln Phe Asp Glu Met Glu Glu Cys Ser
420 425 430

Ala Ser Leu Pro Tyr Met Asp Glu Thr Arg Ala Ile Ala Gly Gln Phe
435 440 445

Lys Glu Lys Val Leu Gly Phe Ile Ser Thr Thr Gly Gln Lys Ala Glu
450 455 460

Thr Leu Lys Pro Ala Ala Thr Ser Val Trp Asn Lys Ala Glu Gln Phe
465 470 475 480

Trp Ala Thr Tyr
(2) INFORMATION FOR SEQ ID NO: 201:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..340

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 2..337



(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201: 

C TCC ACT GTG ACT GAG AGA GAC ATC AGG GTC GAA GAA GAA GTC TAT 46
Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Glu Val Tyr
1 5 10 15

CAG TGT TGT GAT CTG GAG CCC GAG GCC CGC AAG GTA ATA ACC GCC CTC 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Thr Ala Leu
20 25 30

ACG GAG AGA CTC TAC GTG GGC GGC CCT ATG TAC AAT AGC AAG GGA GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp
35 40 45

CTT TGC GGG TAT CGC AGG TGC CGC GCA AGC GGC GTA TAT ACC ACC AGC 190
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACA CTG ACG TGC TAC CTT AAA GCC TCA GCA GCC ATC AGG 238
Phe Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg
65 70 75

GCT GCG GGG CTG AAG GAC TGC ACC ATG CTG GTT TGC GGT GAC GAC TTA 286
Ala Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTC GTG ATC GCT GAA AGC GGT GGC GTC GAG GAG GAC AAG CGA GCC CTC 334
Val Val Ile Ala Glu Ser Gly Gly Val Glu Glu Asp Lys Arg Ala Leu
100 105 110

GGA GCT 340
Gly Ala



(2) INFORMATION FOR SEQ ID NO: 202:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Glu Val Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Thr Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg Ala
65 70 75 80

Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Gly Gly Val Glu Glu Asp Lys Arg Ala Leu Gly
100 105 110
Ala 



(2) INFORMATION FOR SEQ ID NO: 203:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..340

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 2..337



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203: 

C TCC ACA GTG ACT GAA AGA GAC ATC AGG GTC GAG GAA GAG GTC TAC 46
Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Glu Val Tyr
1 5 10 15

CAG TGT TGT GAC CTG GAG CCT GAA ACC CGC AAG GTA ATA TCT GTC CTC 94
Gln Cys Cys Asp Leu Glu Pro Glu Thr Arg Lys Val Ile Ser Ala Leu
20 25 30

ACT GAA AGA CTC TAT GTG GGC GGT CCC ATG CAC AAC AGC AGG GGA GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Arg Gly Asp
35 40 45

CTA TGC GGG TAC CGT AGA TGC CGC GCG AGC GGC GTA TAC ACC ACA AGC 190
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACT CTG ACG TGC TTC CTC AAG GCC ACA GCG GCC ACC AAA 238
Phe Gly Asn Thr Leu Thr Cys Phe Leu Lys Ala Thr Ala Ala Thr Lys
65 70 75

GCC GCT GGC CTA AAG GAC TGC ACC ATG TTG GTG TGT GGT GAC GAC TTA 286
Ala Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTC GTT ATC GCC GAA AGC GAT GGT GTC GAA GAG GAC CGC CGA GCC CTC 334
Val Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Arg Arg Ala Leu
100 105 110

GGA GCT 340
Gly Ala



(2) INFORMATION FOR SEQ ID NO: 204:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:

Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Glu Val Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Thr Arg Lys Val Ile Ser Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Arg Gly Asp Leu
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Leu Thr Cys Phe Leu Lys Ala Thr Ala Ala Thr Lys Ala
65 70 75 80

Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Arg Arg Ala Leu Gly
100 105 110

Ala



(2) INFORMATION FOR SEQ ID NO: 205:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..340

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 2..337

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205: 

C TCC ACG GTG ACC GAA AGG GAT ATC AGG ACC GAG GAA GAG ATC TAC 46
Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Glu Ile Tyr
1 5 10 15

CAG TGC TGC GAC CTG GAG CCC GAA GCC CGC AAG GTG ATA TCC GCC CTA 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ala Leu
20 25 30

ACG GAA AGA CTC TAC GTG GGC GGT CCC ATG TAC AAC TCC AAG GGG GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp
35 40 45

CTA TGC GGG CAA CGG AGG TGC CGC GCA AGC GGG GTC TAC ACC ACC AGC 190
Leu Cys Gly Gln Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACT GTA ACG TGT TAT CTC AAG GCC GT? GCG GCT ACT AGG 238
Phe Gly Asn Thr Val Thr Cys Tyr Leu Lys Ala Val Ala Ala Thr Arg
65 70 75

GCC GCA GGT CTG AAA GGT TGC AGC ATG CTG GTT TGT GGA GAC GAC TTA 286
Ala Ala Gly Leu Lys Gly Cys Ser Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTC GTC ATC TGC GAG AGC GGC GGC GTA GAG GAG GAT GCA AGA GCC CTC 334
Val Val Ile Cys Glu Ser Gly Gly Val Glu Glu Asp Ala Arg Ala Leu
100 105 110

CGA GCC 340
Arg Ala



(2) INFORMATION FOR SEQ ID NO: 206:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206:

Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Glu Ile Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Gln Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Val Thr Cys Tyr Leu Lys Ala Val Ala Ala Thr Arg Ala
65 70 75 80

Ala Gly Leu Lys Gly Cys Ser Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Cys Glu Ser Gly Gly Val Glu Glu Asp Ala Arg Ala Leu Arg
100 105 110

Ala

(2) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..340

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 2..337



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207: 

C TCC ACG GTG ACT GAA AGG GAC ATT AGG GTC GAG GAA GAG ATC TAC 46
Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Glu Ile Tyr
1 5 10 15

CAG TGC TGT GAC CTG GAG CCC GAG GCA CGC AAG GTG ATA TCC GCT CTC 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ala Leu
20 25 30

ACA GAA AGA CTC TAC AAG GGC GGC CCC ATG TAT AAC AGC AAG GGG GAC 142
Thr Glu Arg Leu Tyr Lys Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp
35 40 45

CTA TGC GGG CTT CGG AGG TGC CGC GCA AGC GGG GTA TAC ACC ACA AGC 190
Leu Cys Gly Leu Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACG GTG ACA TGC TAC CTT AAA GCC ACA GCA GCC ACC AGG 238
Phe Gly Asn Thr Val Thr Cys Tyr Leu Lys Ala Thr Ala Ala Thr Arg
65 70 75

GCT GCA GGG CTG AAA GAT TGC ACT ATG CTG GTA TGC GGT GAC GAC TTA 286
Ala Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTC GTT ATT GCC GAA AGC GGT GGC GTG GAG GAG GAC GCC CGA GCC CTC 334
Val Val Ile Ala Glu Ser Gly Gly Val Glu Glu Asp Ala Arg Ala Leu
100 105 110

CGA GCC 340
Arg Ala



(2) INFORMATION FOR SEQ ID NO: 208:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:

Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Glu Ile Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Lys Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Leu Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Val Thr Cys Tyr Leu Lys Ala Thr Ala Ala Thr Arg Ala
65 70 75 80

Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Gly Gly Val Glu Glu Asp Ala Arg Ala Leu Arg
100 105 110

Ala



(2) INFORMATION FOR SEQ ID NO: 209:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..340

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:

CCCCACCGTG ACNGAGAGGG ACNTCAGGGT CGAGGA?GAG GTCTATCAGT GCTGTAATCT 60

GGAGNCCGAT GNCCGCAAGG TCATCAACGC CCTCACAGAG AGACTCTACG TGGGCGGCCC 120

TATGCACAAC AGCAAGGGAG ACCTGTGTGG CATCCGTAGA TGCCGCGCGA GCGGCGTTTA 180

CACCACGAGC TTCGGAAACA CGCTGACTTG CTACCTCAAA GCCACAGCGG CCACCAGGGC 240

CGCGGGCTTG AAGGATTGCA CCATGCTGGT CTGCGG?GAC GACCTGGTTG TCATTGCTGA 300

GAGCATTGGC ATAGACGAGG ACAAGCAAGC CCTCCG?AC? 340
(2) INFORMATION FOR SEQ ID NO: 210:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:

Pro Thr Val Thr Glu Arg Asp Xaa Arg Val Glu Glu Glu Val Tyr Gln
1 5 10 15

Cys Cys Asn Leu Glu Xaa Asp Xaa Arg Lys Val Ile Asn Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Ile Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Thr Ala Ala Thr Arg Ala
65 70 75 80

Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Ile Gly Ile Asp Glu Asp Lys Gln Ala Leu Arg
100 105 110

Thr



(2) INFORMATION FOR SEQ ID NO: 211: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..340 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211: 

CTCGACTGTG NCCGAGAGGG ACATCAGGAC AGAGGGAGAG GTCTATCAGT GTTGCGACCT 60 
GGAACCGGAA GCCCGCAAGG TAATCACCGC CCTCACTGAG AGACTCTATG TGGGCGGACC 120 
CATGTTCAAC AGCAAGGGAG ACCTGTGCGG ACAACGCCGG TGCCGCGCAA GCGGCGTGTT 180 
CACCACCAGC TTCGGGAACA CACTGACGTG CTACCTTAAA GCCACAGCTG CTACTAGAGC 240 
AGCCGGCTTA AAAGATTGCA CCATGCTGGT CTGCGGTGAC GACTTAGTCG TTATTTCCGA 300 
GAGCGCCGGT GTGGAGGAGG ATCCCANAAC CCNNCGACOr 340 

(2) INFORMATION FOR SEQ ID NO: 212: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212: 

Ser Thr Val Xaa Glu Arg Asp Ile Arg Thr Glu Gly Glu Val Tyr Gln 
1 5 10 15 

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Thr Ala Leu Thr 
20 25 30 

Glu Arg Leu Tyr Val Gly Gly Pro Met Phe Asn Ser Lys Gly Asp Leu 
35 40 45 

Cys Gly Gln Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Phe 
50 55 60 

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Thr Ala Ala Thr Arg Ala 
65 70 75 80 

Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Val Ile Ser Glu Ser Ala Gly Val Glu Glu Asp Pro Xaa Thr Xaa Arg 
100 105 110 

Pro 



(2) INFORMATION FOR SEQ ID NO: 213: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2..340 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 2..337 



(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213: 

C TCA ACA GTC ACC GAG AAC GAC ATC CGT GTT GAG GAG TCA ATT TAC 46 
Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr 
1 5 10 15 

CAA TGT TGT GAC TTG GCC CCC GAG GCC AGA GAG GCC ATA AAG TCG CTC 94 
Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys Ser Leu 
20 25 30 

ACA GAG CGG CTT TAT ATC GGG GGT CCC CTG ACT AAT TCA AAG GGG CAG 142 
Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln 
35 40 45 

AAC TGT GGC TAT CGC CGA TGC CGC GCA AGC GGC GTG CTG ACG ACC AGC 190 
Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser 
50 55 60 

TGC GGT AAT ACC CTT ACA TGT TAC CTA AAG GCC TCT GCA GCC TGT CGA 238 
Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg 
65 70 75 

GCT GCG AAG CTC CAG GAC TGC ACG ATG CTC GTG TGC GGG GAC GAC CTT 286 
Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu 
80 85 90 95 

GTC GTT ATC TGT GAA AGC GCG GGA ACC CAA GAG GAC GCG GCG AGC CTA 334 
Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala Ser Leu 
100 105 110 

CGA GTC 340 
Arg Val 



(2) INFORMATION FOR SEQ ID NO: 214: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214: 

Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln 
1 5 10 15 

Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys Ser Leu Thr 
20 25 30 

Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn 
35 40 45 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys 
50 55 60 

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala 
65 70 75 80 

Ala Lys Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala Ser Leu Arg 
100 105 110 

Val 



(2) INFORMATION FOR SEQ ID NO: 215: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 




(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2. .340 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 2..340 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215: 

C TCA ACC GTC ACG GAG AGG GAT ATA AGA ACA GAA GAA TCC ATA TAT 46 
Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr 
1 5 10 15 

CAA GCT TGT TCC CTG CCC CAA GAG GCC AGA ACT GTC ATA CAC TCG CTC 94 
Gln Ala Cys Ser Leu Pro Gln Glu Ala Arg Thr Val Ile His Ser Leu 
20 25 30 

ACC GAG AGA CTC TAC GTG GGA GGG CCC ATG ATA AAC AGC AAA GGG CAA 142 
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met Ile Asn Ser Lys Gly Gln 
35 40 45 

TCC TGC GGT TAC AGG CGT TGC CGC GCA AGC GGT GTT TTC ACC ACC AGC 190 
Ser Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser 
50 55 60 

ATG GGG AAT ACC ATG ACG TGT TAC ATC AAA GCC CTT GCA GCG TGT AA::. 238 
Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys Lys 
65 70 75 

GCC GCA GGG ATC GTG GAC CCC GTC ATG CTG GTG TGT GGA GAC GAC CTG 286 
Ala Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp Asp Leu 
80 85 90 95 

GTC GTC ATC TCG GAG AGC CAG GGT AAC GAG GAG GAC GAG CGA AAC CTG 334 
Val Val Ile Ser Glu Ser Gln Gly Asn Glu Glu Asp Glu Arg Asn Leu 
100 105 110 

AGA GCT 340 
Arg Ala 



(2) INFORMATION FOR SEQ ID NO: 216: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216: 

Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gln 
1 5 10 15 

Ala Cys Ser Leu Pro Gln Glu Ala Arg Thr Val Ile His Ser Leu Thr 
20 25 30 

Glu Arg Leu Tyr Val Gly Gly Pro Met Ile Asn Ser Lys Gly Gln Ser 
35 40 45 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Met 
50 55 60 

Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys Lys Ala 
65 70 75 80 

Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Val Ile Ser Glu Ser Gln Gly Asn Glu Glu Asp Glu Arg Asn Leu Arg 
100 105 110 

Ala 



(2) INFORMATION FOR SEQ ID NO: 217: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2..340 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 2..340 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217: 

C TCG ACT GTC ACT GAA CAG GAC ATC AGG GTG GAA GAG GAG ATA TAT 46 
Ser Thr Val Thr Glu Gln Asp Ile Arg Val Glu Glu Glu Ile Tyr 
1 5 10 15 

CAA TGC TGC AAC CTT GAA CCG GAG GCC AGG AAA GTG ATC TCC TCC CTC 94 
Gln Cys Cys Asn Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ser Leu 
20 25 30 

ACG GAG CGG CTT TAC TGC GGA GGC CCT ATG TTT AAC AGC AAG GGG GCC 142 
Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly Ala 
35 40 45 

CAG TGT GGT TAT CGC CGT TGC CGT GCC AGT GGA GTT CTG CCT ACC AGC 190 
Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr Ser 
50 55 60 

TTT GGC AAC ACA ATC ACT TGT TAC ATC AAG GCC ACA ACG GCC GCG AAG 238 
Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Thr Ala Ala Lys 
65 70 75 

GCC GCA GGC CTC CGG AAC CCG GAC TTT CTT GTC TGC GGA GAT GAT CTG 286 
Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp Leu 
80 85 90 95 

GTC GTG GTG GCT GAG AGT GAT GGC GTC GAC GAG GAT AGA GCA GCC CTG 334 
Val Val Val Ala Glu Ser Asp Gly Val Asp Glu Asp Arg Ala Ala Leu 
100 105 110 

AGA GCC 340 

Arg Ala 



(2) INFORMATION FOR SEQ ID NO: 218: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218: 

Ser Thr Val Thr Glu Gln Asp Ile Arg Val Glu Glu Glu Ile Tyr Gln 
1 5 10 15 

Cys Cys Asn Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ser Leu Thr 
20 25 30 

Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly Ala Gln 
35 40 45 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr Ser Phe 
50 55 60 

Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Thr Ala Ala Lys Ala 
65 70 75 80 

Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Val Val Ala Glu Ser Asp Gly Val Asp Glu Asp Arg Ala Ala Leu Arg 
100 105 110 

Ala 

(2) INFORMATION FOR SEQ ID NO: 219: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219: 

Arg Ser Glu Gly Arg Thr Ser Trp Ala Gln 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 220: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220: 

Arg Ser Glu Gly Arg Thr Ser Trp Ala Gln 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 221: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221: 

Arg Thr Glu Gly Arg Thr Ser Trp Ala Gln 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 222: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 629 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 3..629 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 3..629 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222: 

TA GAC TTT TGG GAG AGC GTC TTC ACT GGA CTA ACT CAC ATA GAT GCC 47 
Asp Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala 
1 5 10 15 

CAC TTT CTG TCA CAG ACT AAG CAG CAG GGA CTC AAC TTC TCG TTC CTG 95 
His Phe Leu Ser Gln Thr Lys Gln Gln Gly Leu Asn Phe Ser Phe Leu 
20 25 30 

ACT GCC TAC CAA GCC ACT GTG TGC GCT CGC GCG CAG GCT CCT CCC CCA 143 
Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro 
35 40 45 

AGT TGG GAC GAG ATG TGG AAG TGT CTC GTA CGG CTT AAG CCA ACA CTA 191 
Ser Trp Asp Glu Met Trp Lys Cys Leu Val Arg Leu Lys Pro Thr Leu 
50 55 60 

CAT GGA CCT ACG CCT CTT CTA TAT CGG TTG GGG CCT GTC CAA AAT GAA 239 
His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln Asn Glu 
65 70 75 

ATC TGC TTG ACA CAC CCC ATC ACA AAA TAC ATC ATG GCA TGC ATG TCA 287 
Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser 
80 85 90 95 

GCT GAT CTG GAA GTA ACC ACC AGC ACC TGG GTT TTG CTT GGA GGG GTC 335 
Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu Gly Gly Val 
100 105 110 

CTC GCG GCC CTA GCG GCC TAC TGC TTG TCA GTC GGT TGT GTT GTG ATT 383 
Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys Val Val Ile 
115 120 125 

GTG GGT CAT ATC GAG CTG GGG GGC AAG CCG GCA ATC GTT CCA GAC AAA 431 
Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile Val Pro Asp Lys 
130 135 140 

GAG GTG TTG TAT CAA CAA TAC GAT GAG ATG GAA GAG TGC TCA CAA GCT 479 
Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys Ser Gln Ala 
145 150 155 

GCC CCA TAT ATC GAA CAA GCT CAG GTA ATA GCT CAC CAG TTC AAG GAA 527 
Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln Phe Lys Glu 
160 165 170 175 

AAA GTC CTT GGA TTG CTG CAG CGA GCC ACC CAA CAA CAA GCT GTC ATT 575 
Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln Ala Val Ile 
180 185 190 

GAG CCC ATA GTA ACT ACC AAC TGG CAA AAG CTT GAG GCC TTT TGG CAC 623 
Glu Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu Ala Phe Trp His 
195 200 205 

AAG CAT 629 
Lys His 



(2) INFORMATION FOR SEQ ID NO: 223: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 209 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223: 



Asp Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His 
1 5 10 15 

Phe Leu Ser Gln Thr Lys Gln Gln Gly Leu Asn Phe Ser Phe Leu Thr 
20 25 30 

Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser 
35 40 45 

Trp Asp Glu Met Trp Lys Cys Leu Val Arg Leu Lys Pro Thr Leu His 
50 55 60 

Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln Asn Glu Ile 
65 70 75 80 

Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala 
85 90 95 

Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu Gly Gly Val Leu 
100 105 110 

Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys Val Val Ile Val 
115 120 125 

Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile Val Pro Asp Lys Glu 
130 135 140 

Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys Ser Gln Ala Ala 
145 150 155 160 

Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln Phe Lys Glu Lys 
165 170 175 

Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln Ala Val Ile Glu 
180 185 190 

Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu Ala Phe Trp His Lys 
195 200 205 

His 



(2) INFORMATION FOR SEQ ID NO: 224: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 
(B) LOCATION: 2..12 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224: 

Ile His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 225: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225: 

Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 226: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226: 

Val Asn Tyr Arg Asn Ala Ser Gly Val Tyr His Ile 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 227: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227: 

Val Asn Tyr His Asn Thr Ser Gly Ile Tyr His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 228: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228: 

Gln His Tyr Arg Asn Ala Ser Gly Ile Tyr His Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 229: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229: 

Gln His Tyr Arg Asn Val Ser Gly Ile Tyr His Val 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 230: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230: 

Ile His Tyr Arg Asn Ala Ser Asp Gly Tyr Tyr Ile 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 231: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231: 

Leu Gln Val Lys Asn Thr Ser Ser Ser Tyr Met Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 232: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232: 

Val Trp Gln Leu Arg Ala Ile Val Leu His Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 233: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233: 

Val Tyr Glu Ala Asp Tyr His Ile Leu His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 234: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234: 

Val Tyr Glu Thr Asp Asn His Ile Leu His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 235: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235: 

Val Tyr Glu Thr Glu Asn His Ile Leu His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 236: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236: 

Val Phe Glu Thr Val His His Ile Leu His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 237: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237: 

Val Phe Glu Thr Glu His His Ile Leu His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 238: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238: 

Val Phe Glu Thr Asp His His Ile Met His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 239: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239: 

Val Tyr Glu Thr Glu Asn His Ile Leu His Leu 
1 5 10 






(2) INFORMATION FOR SEQ ID NO: 240: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240: 

Val Tyr Glu Ala Asp Ala Leu Ile Leu His Ala 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 241: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241: 

Val Gln Asp Gly Asn Thr Ser Ala Cys Trp Thr Pro Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 242: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242: 
Val Lys Thr Gly Asn Gln Ser Arg Cys Trp Val Ala Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 243: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 243: 

Val Lys Thr Gly Asn Gln Ser Arg Cys Trp Val Ala Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 244: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244: 

Val Arg Thr Gly Asn Gln Ser Arg Cys Trp Val Ala Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 245: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245: 

Val Lys Thr Gly Asn Gln Ser Arg Cys Trp Ile Ala Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 246: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246: 

Val Lys Thr Gly Asn Gln Ser Arg Cys Trp Ile Ala Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 247: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247: 

Val Lys Thr Gly Asn Ser Val Arg Cys Trp Ile Pro Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 248: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248: 

Val Lys Thr Gly Asn Val Ser Arg Cys Trp Ile Ser Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 249: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249: 

Val Arg Lys Asp Asn Val Ser Arg Cys Trp Val Gln Ile 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 250: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250: 

Ala Pro Ser Phe Gly Ala Val Thr Ala Pro 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 251: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251: 

Val Ser Gln Pro Gly Ala Leu Thr Lys Gly 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 252: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252: 

Val Lys Tyr Val Gly Ala Thr Thr Ala Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 253: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 253: 

Ala Pro Tyr Ile Gly Ala Pro Val Glu Ser 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 254: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254: 

Ala Gln His Leu Asn Ala Pro Leu Glu Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 255: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255: 

Ser Pro Tyr Val Gly Ala Pro Leu Glu Pro 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 256: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256: 

Ser Pro Tyr Ala Gly Ala Pro Leu Glu Pro 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 257: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257: 

Ala Pro Tyr Leu Gly Ala Pro Leu Glu Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 258: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 258: 

Ala Pro Tyr Leu Gly Ala Pro Leu Glu Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 259: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259: 

Ala Pro Tyr Val Gly Ala Pro Leu Glu Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 260: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260: 

Asn Val Pro Tyr Leu Gly Ala Pro Leu Thr Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 261: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261: 

Ala Pro His Leu Arg Ala Pro Leu Ser Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 262: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262: 

Ala Pro Tyr Leu Gly Ala Pro Leu Thr Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 263: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263: 

Arg Pro Arg Gln His Ala Thr Val Gln Asp 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 264: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 264: 

Ser Pro Gln His His Lys Phe Val Gln Asp 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 265: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265: 

Arg Pro Arg Arg Leu Trp Thr Thr Gln Glu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 266: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266: 

Pro Pro Arg Ile His Glu Thr Thr Gln Asp 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 267: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267: 

Thr Ile Ser Tyr Ala Asn Gly Ser Gly Pro Ser Asp Asp Lys 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 268: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 268: 

Ser Arg Arg Gln Pro Ile Pro Arg Ala Arg Arg Thr Glu Gly Arg Ser 
1 5 10 15 

Trp Ala Gin 



(2) INFORMATION FOR SEQ ID NO: 269: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1443 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1443 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..1443 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269: 

ACC ATC ACC ACC GGA GCT TCT ATC ACA TAC TCC ACT TAC GGC AAG TTC 48 
Thr Ile Thr Thr Gly Ala Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe 
1 5 10 15 

CTT GCT GAT GGA GGG TGT TCA GGC GGC GCG TAT GAC GTG ATC ATA TGC 96 
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Val Ile Ile Cys 
20 25 30 

GAC GAG TGC CAT TCC CAG GAC GCC ACC ACC ATT CTT GGG ATA GGC ACT 144 
Asp Glu Cys His Ser Gln Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr 
35 40 45 

GTC CTT GAC CAG GCA GAG ACG GCT GGA GCT AGG CTC GTC GTC TTG GCC 192 
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala 
50 55 60 

ACG GCC ACC CCT CCC GGC AGT GTG ACA ACG CCC CAC CCC AAC ATC GAG 240 
Thr Ala Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro Asn Ile Glu 
65 70 75 80 

GAA GTG GCC CTG CCT CAG GAG GGG GAG GTT CCC TTC TAC GGC AGA GCC 288 
Glu Val Ala Leu Pro Gln Glu Gly Glu Val Pro Phe Tyr Gly Arg Ala 
85 90 95 

ATT CCC CTT GCT TTT ATA AAG GGT GGT AGG CAT CTC ATC TTC TGC CAT 336 
Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu Ile Phe Cys His 
100 105 110 

TCC AAG AAA AAA TGT GAT GAA CTC GCC AAG CAA CTG ACC AGC CTG GGC 384 
Ser Lys Lys Lys Cys Asp Glu Leu Ala Lys Gln Leu Thr Ser Leu Gly 
115 120 125 

GTG AAC GCC GTG GCA TAT TAT AGA GGT CTA GAC GTC GCC GTC ATC CCC 432 
Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ala Val Ile Pro 
130 135 140 

ACA GCA GGA GAC GTG GTC GTG TGC AGC ACC GAC GCG CTC ATG ACG GGA 480 
Thr Ala Gly Asp Val Val Val Cys Ser Thr Asp Ala Leu Met Thr Gly 
145 150 155 160 

TTC ACC GGC GAC TTT GAT TCT GTC ATA GAC TGC AAC TCC GCC GTC ACT 528 
Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Ser Ala Val Thr 
165 170 175 

CAG ACG GTG GAC TTC AGT CTG GAT CCC ACT TTT ACC ATT GAG ACT ACC 576 
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr 
180 185 190 

ACA GTG CCC CAG GAC GCA GTG TCC AGA AGC CAG CGT AGG GGC CGC ACG 624 
Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr 
195 200 205 

GGG AGA GGT AGG CAC GGC ATA TAC CGG TAT GTC TCG GCT GGA GAG AGA 672 
Gly Arg Gly Arg His Gly Ile Tyr Arg Tyr Val Ser Ala Gly Glu Arg 
210 215 220 

CCG TCT GAC ATG TTC GAC TCC GTG GTG CTC TGT GAG TGC TAC GAT GCC 720 
Pro Ser Asp Met Phe Asp Ser Val Val Leu Cys Glu Cys Tyr Asp Ala 
225 230 235 240 

GGA TGT GCG TGG TAT GAT CTG ACT CCT GCC GAG ACT ACC GTG AGG TTG 768 
Gly Cys Ala Trp Tyr Asp Leu Thr Pro Ala Glu Thr Thr Val Arg Leu 
245 250 255 

CGC GCT TAC ATA AAC ACC CCC GGG CTC CCT GTC TGT CAG GAC CAT TTG 816 
Arg Ala Tyr Ile Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu 
260 265 270 

GAA TTC TGG GAG GGG GTG TTC ACG GGG CTC ACT AAC ATC GAC GCT CAC 864 
Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr Asn Ile Asp Ala His 
275 280 285 

ATG CTG TCA CAG ACC AAA CAG GGT GGG GAG AAT TTC CCA TAC CTT GTA 912 
Met Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Pro Tyr Leu Val 
290 295 300 

GCG TAC CAA GCA ACA GTC TGT GTT CGC GCC AAA GCG CCC CCC CCC AGC 960 
Ala Tyr Gln Ala Thr Val Cys Val Arg Ala Lys Ala Pro Pro Pro Ser 
305 310 315 320 

TGG GAC ACA ATG TGG AAA TGC ATG CTC CGT CTC AAA CCG ACT TTA ACT 10 08 

Trp Asp Thr Met Trp Lys Cys Mec Leu Arg Leu Lys Pro Thr Leu Thr 

325 33C 335 

GGC CCT ACT CCC CTC TTG TAC AGG CTG GGG CCC GTC CAG AAT GAG ATC 10 5 5 

Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gin Asn Glu lie 
340 345 350 

ACA CTG ACG CAC CCC ATC ACC AAG TAC ATT ATG GCT TGC ATG TCT GCG 1104 
Thr Leu Thr His Pro lie Thr Lys Tyr lie Mec Ala Cys Mec Ser Ala 
355 360 365 

GAC TTG GAG GTC ATT ACC AGC ACT TGG GTT CTG GTG GGG GGC GTT GTG 115 2 

Asp Leu Glu Val lie Thr Ser Thr Trp Val Leu Val Gly Gly Val Val 
370 375 330 

GCG GCC CTG GCG GCC TAC TGC TTG ACG GTG GGT TCG GTA GCC ATA GTC 1200
Ala Ala Leu Ala Ala Tyr Cys Leu Thr Val Gly Ser Val Ala Ile Val
385 390 395 400

GGT AGG ATC ATC CTC TCT GGG AAA CCT GCC ATC ATT CCC GAT AGG GAG 1248
Gly Arg Ile Ile Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu
405 410 415 

GCA TTA TAC CAG CAA TTT GAT GAG ATG GAG GAG TGC TCG GCC TCG TTG 1296
Ala Leu Tyr Gln Gln Phe Asp Glu Met Glu Glu Cys Ser Ala Ser Leu
420 425 430 

CCC TAT ATG GAC GAG ACA CGT GCC ATT GCC GGA CAA TTC AAA GAG AAA 1344
Pro Tyr Met Asp Glu Thr Arg Ala Ile Ala Gly Gln Phe Lys Glu Lys
435 440 445 

GTG CTC GGC TTC ATC AGC ACG ACC GGC CAG AAG GCT GAA ACT CTG AAG 1392
Val Leu Gly Phe Ile Ser Thr Thr Gly Gln Lys Ala Glu Thr Leu Lys
450 455 460

CCG GCA GCC ACG TCT GTG TGG AAC AAG GCT GAG CAG TTC TGG GCC ACA 1440
Pro Ala Ala Thr Ser Val Trp Asn Lys Ala Glu Gln Phe Trp Ala Thr
465 470 475 480

(2) INFORMATION FOR SEQ ID NO: 270:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 481 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270: 

Thr Ile Thr Thr Gly Ala Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe
1 5 10 15

Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Val Ile Ile Cys
20 25 30 

Asp Glu Cys His Ser Gln Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr
35 40 45 

Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
50 55 60 

Thr Ala Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro Asn Ile Glu
65 70 75 80

Glu Val Ala Leu Pro Gln Glu Gly Glu Val Pro Phe Tyr Gly Arg Ala
85 90 95

Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
100 105 110 

Ser Lys Lys Lys Cys Asp Glu Leu Ala Lys Gln Leu Thr Ser Leu Gly
115 120 125 

Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ala Val Ile Pro
130 135 140 

Thr Ala Gly Asp Val Val Val Cys Ser Thr Asp Ala Leu Met Thr Gly
145 150 155 160 

Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Ser Ala Val Thr
165 170 175 

Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
180 185 190

Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr
195 200 205 

Gly Arg Gly Arg His Gly Ile Tyr Arg Tyr Val Ser Ala Gly Glu Arg
210 215 220 

Pro Ser Asp Met Phe Asp Ser Val Val Leu Cys Glu Cys Tyr Asp Ala
225 230 235 240



Gly Cys Ala Trp Tyr Asp Leu Thr Pro Ala Glu Thr Thr Val Arg Leu 
245 250 255 

Arg Ala Tyr Ile Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
260 265 270 

Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr Asn Ile Asp Ala His
275 280 285 

Met Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Pro Tyr Leu Val
290 295 300 

Ala Tyr Gln Ala Thr Val Cys Val Arg Ala Lys Ala Pro Pro Pro Ser
305 310 315 320 

Trp Asp Thr Met Trp Lys Cys Met Leu Arg Leu Lys Pro Thr Leu Thr
325 330 335 

Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln Asn Glu Ile
340 345 350 

Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala
355 360 365

Asp Leu Glu Val Ile Thr Ser Thr Trp Val Leu Val Gly Gly Val Val
370 375 380 

Ala Ala Leu Ala Ala Tyr Cys Leu Thr Val Gly Ser Val Ala Ile Val
385 390 395 400 

Gly Arg Ile Ile Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu
405 410 415 

Ala Leu Tyr Gln Gln Phe Asp Glu Met Glu Glu Cys Ser Ala Ser Leu
420 425 430 

Pro Tyr Met Asp Glu Thr Arg Ala Ile Ala Gly Gln Phe Lys Glu Lys
435 440 445 

Val Leu Gly Phe Ile Ser Thr Thr Gly Gln Lys Ala Glu Thr Leu Lys
450 455 460 

Pro Ala Ala Thr Ser Val Trp Asn Lys Ala Glu Gln Phe Trp Ala Thr
465 470 475 480 

Tyr 